Transgenic fish with tissue-specific expression

ABSTRACT

Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. The transgenic fish contain transgene constructs with homologous expression sequences. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes.

BACKGROUND OF THE INVENTION

The disclosed invention is generally in the field of transgenic fish, and more specifically in the area of transgenic fish exhibiting tissue-specific expression of a transgene.

Transgenic technology has become an important tool for the study of gene and promoter function (Hanahan, Science 246:1265-75 (1989); Jaenisch, Science 240:1468-74 (1988)). The ability to express, and study the expression of, genes in whole animals can be facilitated by the use of transgenic animals. Transgenic technology is also a useful tool for cell lineage analysis and for transplantation experiments. Studies on promoter function or lineage analysis generally require the expression of a foreign reporter gene, such as the bacterial gene lacZ. Expression of a reporter gene can allow the identification of tissues harboring a transgene. Typically, transgenic expression has been identified by in situ hybridization or by histochemistry in fixed animals. Unfortunately, the inability to easily detect transgene expression in living animals severely limits the utility of this technology, particularly for lineage analysis.

An attractive paradigm for the understanding of gene expression, development, and genetics of animals, especially humans, is to study less complex organisms, such as Escherichia coli, Drosophila, and Caenorhabditis. The hope is that understanding of these processes in simple organisms will have relevance to similar processes in mammals and humans. The tradeoff is to accept the disadvantage that an experimental organism is only distantly related to humans for the advantage of easy manipulation, fast generation times, and more straightforward interpretation of results in the experimental organism. The disadvantage of this tradeoff can be lessened by using an organism that is as closely related as possible to mammals while retaining as many of the advantages of less complex organisms. The problem is to identify suitable organisms for such studies, and, more importantly, to develop the tools necessary to manipulate such organisms.

Some examples of cell determination in invertebrates have been shown to occur in progressive waves that are regulated by sequential cascades of transcription factors. Much less is known about such processes in vertebrates. An integrated approach combining embryological, genetic and molecular methods, such as that used to study development in Drosophila (for example, Ghysen et al., Genes & Dev 7:723-33 (1993)), would facilitate the identification of the molecular mechanisms involved in specifying neuronal fates in vertebrates, but such an approach has been hampered by a lack of robust genetic and molecular tools for use in vertebrates.

Transgenic technology has been applied to fish for various purposes. For example, transgenic technology has been applied to several commercially important varieties of fish, primarily in an attempt to improve their cultivation. The use of transgenic technology in fish has been reviewed by Moav, Israel J. of Zoology 40:441-466 (1994), Chen et al., Zoological Studies 34:215-234 (1995), and Iyengar et al., Transgenic Res. 5:147-166 (1996).

Stuart et al., Development 103:403-412 (1988), describe integration of foreign DNA into zebrafish, but no expression was observed. Stuart et al., Development 109:577-584 (1990), describe expression of a transgene in zebrafish from SV40 and Rous sarcoma virus transcription regulatory sequences. Although expression was seen in a pattern of tissues, the expression within a given tissue was variegated. Also, since Stuart et al. (1990) selected transgenics by expression and not by the presence of the transgene, non-expressing transgenics would have been missed by their analysis. Culp et al., Proc. Natl. Acad. Sci. USA 88:7953-7957 (1991), describe integration and germ line transmission of DNA in zebrafish. Although the constructs used included the Rous sarcoma virus LTR or SV40 enhancer promoter linked to a lacZ gene, no expression was observed. Bayer and Campos-Ortega, Development 115:421-426 (1992), describe integration and expression in zebrafish of a lacZ transgene having a minimal promoter (a mouse heat shock promoter) but no upstream regulatory sequences. The expression obtained depended on the site of integration, indicating that endogenous sequences at the site of integration of the fish were responsible for expression. Westerfield et al., Genes & Development 6:591-598 (1992), describe transient expression in zebrafish of β-galactosidase from mouse and human Hox gene promoters. Lin et al., Dev. Biology 161:77-83 (1994), describe transgenic expression of lacZ in living zebrafish embryos. The transgene linked the enhancer-promoter of the Xenopus elongation factor 1α gene with the lacZ coding sequence. Different lines of transgenic fish exhibited different patterns of expression, indicating that the site of integration may be affecting the pattern of expression. Amsterdam et al., Dev. Biology 171:123-129 (1995), and Amsterdam et al., Gene 173:99-103 (1996), describe transgenic expression of green fluorescent protein (GFP) in zebrafish. The transgene linked the enhancer-promoter of the Xenopus elongation factor 1α gene with the GFP coding sequence. As in Lin et al., Dev. Biology 161:77-83 (1994), different lines of transgenic fish exhibited different patterns of expression, indicating that the site of integration may be affecting the pattern of expression. Although some of the systems described above exhibited patterned expression, none resulted in the transmission of stable tissue-specific expression of a transgene in zebrafish.

It is an object of the present invention to provide transgenic fish having tissue- and developmentally-specific expression of transgenes.

It is another object of the present invention to provide a method of making transgenic fish having tissue- and developmentally-specific expression of transgenes.

It is another object of the present invention to provide a method of identifying compounds that affect expression of fish genes of interest.

It is another object of the present invention to provide a method of identifying the pattern of expression of fish genes of interest.

It is another object of the present invention to provide a method of identifying genes that affect expression of fish genes of interest.

It is another object of the present invention to provide a method of genetically marking mutant fish genes.

It is another object of the present invention to provide a method of identifying fish that have inherited a mutant gene.

It is another object of the present invention to provide a method of identifying enhancers and other regulatory sequences in fish.

It is another object of the present invention to provide a construct that exhibits tissue- and developmentally-specific expression in fish.

BRIEF SUMMARY OF THE INVENTION

Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. The transgenic fish contain transgene constructs with homologous expression sequences. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A shows the nucleotide sequence at the exon/intron junctions of the zebrafish GATA-1 locus. The conserved splice sequences are underlined and the intron sequences are listed within parentheses. The amino acids encoded by the exon regions flanking the introns are shown beneath the nucleotide sequence. The upstream splice junction nucleotide sequences are SEQ ID NO:6 (IVS-1), SEQ ID NO:7 (IVS-2), SEQ ID NO:8 (IVS-3), and SEQ ID NO:9 (IVS-4). The downstream splice junction nucleotide sequences are SEQ ID NO:10 (IVS-1), SEQ ID NO:11 (IVS-2), SEQ ID NO:12 (IVS-3), and SEQ ID NO:13 (IVS-4). The amino acid sequences spanning the introns are SEQ ID NO:14 (IVS-1), SEQ ID NO:15 (IVS-2), SEQ ID NO:16 (IVS-3), and SEQ ID NO:17 (IVS-4).

FIG. 1B is a diagram of the structure of the zebrafish GATA-1 locus. Exon regions are filled. Intron regions are unfilled. The tall filled boxes represent the coding regions. The arrow indicates the putative transcription start site. EcoRI endonuclease sites are labeled E. BglII endonuclease sites are labeled G. BamHI endonuclease sites are labeled B.

FIG. 2 is a diagram of the structures of three GATA-1/GFP transgene constructs used to make transgenic fish. The filled region to the right of the GM2 box in each construct represents the 5.4 kb or 5.6 kb region of the GATA-1 locus upstream of the GATA-1 coding region. The box labeled GM2 represents a sequence encoding the modified green fluorescent protein. The thin angled lines in constructs (1) and (3) represent vector or linking sequences. EcoRI endonuclease sites are labeled E. BglII endonuclease sites are labeled G. BamHI endonuclease sites are labeled B. In construct (3), the BamHI/EcoRI fragment on the right side is the downstream BamHI/EcoRI fragment of the GATA-1 locus.

FIG. 3 is a diagram of the structures of GATA-2/GFP transgene constructs for analyzing the expression sequences of the GATA-2 gene. The line represents all or upstream deleted portions of a 7.3 kb region upstream of the translation start site in the zebrafish GATA-2 gene. The hatched box represents a segment encoding the modified GFP and including an SV40 polyadenylation signal. Tick marks labeled P, Sa, A, C, and Sc indicate restriction sites PstI, SacI, AatII, ClaI and ScaI, respectively, in the 7.3 kb region.

FIG. 4 is a diagram of the structures of GATA-2/GFP transgene constructs for analyzing the expression sequences of the GATA-2 gene. The thick open box represents a 1116 bp fragment of the upstream region of the GATA-2 gene required for neuron-specific expression. The thin open box represents segments of the upstream region of the GATA-2 gene proximal to the transcription start site. The thick line represents the minimal promoter of the Xenopus elongation factor 1α gene. The hatched box represents a segment encoding the modified GFP and including an SV40 polyadenylation signal.

FIG. 5 is a graph of the percent of embryos microinjected with the transgene constructs shown in FIG. 4 that expressed GFP in neurons.

FIG. 6 is a graph of the percent of embryos microinjected with transgene constructs that expressed GFP in neurons. The transgene constructs were nsP5-GM2 and truncated forms of nsP5-GM2.

FIG. 7 is a graph of the percent of embryos microinjected with transgene constructs that expressed GFP in neurons. The transgene constructs were mutant forms of the ns3831 truncation of nsP5-GM2.

DETAILED DESCRIPTION OF THE INVENTION

Disclosed are transgenic fish, and a method of making transgenic fish, which express transgenes in stable and predictable tissue- or developmentally-specific patterns. Also disclosed are methods of using such transgenic fish. Such expression of transgenes allows the study of developmental processes, the relationship of cell lineages, the assessment of the effect of specific genes and compounds on the development or maintenance of specific tissues or cell lineages, and the maintenance of lines of fish bearing mutant genes. The disclosed transgenic fish are characterized by homologous expression sequences in an exogenous construct introduced into the fish or a progenitor of the fish.

As used herein, transgenic fish refers to fish, or progeny of a fish, into which an exogenous construct has been introduced. A fish into which a construct has been introduced includes fish which have developed from embryonic cells into which the construct has been introduced. As used herein, an exogenous construct is a nucleic acid that is artificially introduced, or was originally artificially introduced, into an animal. The term artificial introduction is intended to exclude introduction of a construct through normal reproduction or genetic crosses. That is, the original introduction of a gene or trait into a line or strain of animal by cross breeding is intended to be excluded. However, fish produced by transfer, through normal breeding, of an exogenous construct (that is, a construct that was originally artificially introduced) from a fish containing the construct are considered to contain an exogenous construct. Such fish are progeny of fish into which the exogenous construct has been introduced. As used herein, progeny of a fish are any fish which are descended from the fish by sexual reproduction or cloning, and from which genetic material has been inherited. In this context, cloning refers to production of a genetically identical fish from DNA, a cell, or cells of the fish. The fish from which another fish is descended is referred to as a progenitor fish. As used herein, development of a fish from a cell or cells (embryonic cells, for example), or development of a cell or cells into a fish, refers to the developmental process by which fertilized egg cells or embryonic cells (and their progeny) grow, divide, and differentiate to form an adult fish.

The examples illustrate the manner in which transgenic fish exhibiting cell lineage-specific expression can be made and used. The transgenic fish described in the examples, and the transgene constructs used, are particularly useful for early detection of fish expressing the transgene, the study of erythroid cell development, the study of neuronal development, and as a reporter for genetically linked mutant genes.

Tissue-, developmental stage-, or cell lineage-specific expression of a reporter gene from a regulated promoter in the disclosed transgenic fish can be useful for identifying the pattern of expression of the gene from which the promoter is derived. Such expression can also allow study of the pattern of development of a cell lineage. As used herein, tissue-specific expression refers to expression substantially limited to specific tissue types. Tissue-specific expression is not necessarily limited to expression in a single tissue but includes expression limited to one or more specific tissues. As used herein, developmental stage-specific expression refers to expression substantially limited to specific developmental stages. Developmental stage-specific expression is not necessarily limited to expression at a single developmental stage but includes expression limited to one or more specific developmental stages. As used herein, cell lineage-specific expression refers to expression substantially limited to specific cell lineages. As used herein, cell lineage refers to a group of cells that are descended from a particular cell or group of cells. In development, for example, newly specialized or differentiated cells can give rise to cell lineages. Cell lineage-specific expression is not necessarily limited to expression in a single cell lineage but includes expression limited to one or more specific cell lineages. All of these types of specific expression can operate in the same gene. For example, a developmentally regulated gene can be both expressed at specific developmental stages and limited to specific tissues. As used herein, the pattern of expression of a gene refers to the tissues, developmental stages, cell lineages, or combinations of these in or at which the gene is expressed.
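The definitions above can be pictured as a simple record-keeping scheme. The following is only an illustrative sketch in Python, not part of the disclosed methods: the names Observation, expression_pattern and is_tissue_specific are hypothetical, the stage labels are placeholders, and "substantially limited" is simplified here to "strictly limited".

```python
from typing import NamedTuple


class Observation(NamedTuple):
    """One scored observation of reporter expression (hypothetical record)."""
    tissue: str
    stage: str      # developmental stage label; placeholder values only
    lineage: str
    expressed: bool


def expression_pattern(observations):
    """Return the (tissue, stage, lineage) combinations in which expression was seen."""
    return {(o.tissue, o.stage, o.lineage) for o in observations if o.expressed}


def is_tissue_specific(observations, allowed_tissues):
    """True if expression was seen and every expressing tissue is in allowed_tissues.

    Simplification: the text says "substantially limited"; this check is strict.
    """
    expressing = {o.tissue for o in observations if o.expressed}
    return bool(expressing) and expressing <= set(allowed_tissues)
```

Because allowed_tissues may hold more than one entry, the check reflects the point that tissue-specific expression need not be confined to a single tissue.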

1. Transgene Constructs

Transgene constructs are the genetic material that is introduced into fish to produce a transgenic fish. Such constructs are artificially introduced into fish. The manner of introduction, and, often, the structure of a transgene construct, render such a transgene construct an exogenous construct. Although a transgene construct can be made up of any nucleic acid sequences, for use in the disclosed transgenic fish it is preferred that the transgene constructs combine expression sequences operably linked to a sequence encoding an expression product. The transgenic construct will also preferably include other components that aid expression, stability or integration of the construct into the genome of a fish. As used herein, components of a transgene construct referred to as being operably linked or operatively linked refer to components being so connected as to allow them to function together for their intended purpose. For example, a promoter and a coding region are operably linked if the promoter can function to result in transcription of the coding region.

A. Expression Sequences

Expression sequences are used in the disclosed transgene constructs to mediate expression of an expression product encoded by the construct. As used herein, expression sequences include promoters, upstream elements, enhancers, and response elements. It is preferred that the expression sequences used in the disclosed constructs be homologous expression sequences. As used herein, in reference to components of transgene constructs used in the disclosed transgenic fish, homologous indicates that the component is native to or derived from the species or type of fish involved. Conversely, heterologous indicates that the component is neither native to nor derived from the species or type of fish involved.

Two large scale chemical mutagenesis screens recently produced thousands of zebrafish mutants affecting development (Driever et al., Development 123:37-46 (1996); Haffter et al., Development 123:1-36 (1996)). The genes affected in these mutants, and their expression patterns, are of significant interest for understanding the developmental process. Therefore, expression sequences from these genes are preferred for use as expression sequences in the disclosed constructs.

As used herein, expression sequences are divided into two main classes, promoters and enhancers. A promoter is generally a sequence or sequences of DNA that function when in a relatively fixed location in regard to the transcription start site. A promoter contains core elements required for basic interaction of RNA polymerase and transcription factors, and may contain upstream elements and response elements. Enhancer generally refers to a sequence of DNA that functions at no fixed distance from the transcription start site and can be in either orientation. Enhancers function to increase transcription from nearby promoters. Enhancers also often contain response elements that mediate the regulation of transcription. Promoters can also contain response elements that mediate the regulation of transcription.

Enhancers often determine the regulation of expression of a gene. This effect has been seen in so-called enhancer trap constructs, in which a construct containing a reporter gene operably linked to a promoter is expressed only when the construct inserts into the domain of an enhancer (O'Kane and Gehring, Proc. Natl. Acad. Sci. USA 84:9123-9127 (1987), Allen et al., Nature 333:852-855 (1988), Kothary et al., Nature 335:435-437 (1988), Gossler et al., Science 244:463-465 (1989)). In such cases, the expression of the construct is regulated according to the pattern of the newly associated enhancer. Transgenic constructs having only a minimal promoter can be used in the disclosed transgenic fish to identify enhancers.

Preferred enhancers for use in the disclosed transgenic fish are those that mediate tissue- or cell lineage-specific expression. More preferred are homologous enhancers that mediate tissue- or cell lineage-specific expression. Still more preferred are enhancers from fish GATA-1 and GATA-2 genes. Most preferred are enhancers from zebrafish GATA-1 and GATA-2 genes.

For expression of encoded peptides or proteins, a transgene construct also needs sequences that, when transcribed into RNA, mediate translation of the encoded expression products. Such sequences are generally found in the 5′ untranslated region of transcribed RNA. This region corresponds to the region on the construct between the transcription initiation site and the translation initiation site (that is, the initiation codon). The 5′ untranslated region of a construct can be derived from the 5′ untranslated region normally associated with the promoter used in the construct, the 5′ untranslated region normally associated with the sequence encoding the expression product, the 5′ untranslated region of a gene unrelated to the promoter or sequence encoding the expression product, or a hybrid of these 5′ untranslated regions. Preferably, the 5′ untranslated region is homologous to the fish into which the construct is to be introduced. Preferred 5′ untranslated regions are those normally associated with the promoter used.
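As a rough illustration of how the 5′ untranslated region is delimited, the sketch below slices a construct sequence between a known transcription initiation position and the initiation codon. It is a simplified example only: the function name five_prime_utr is hypothetical, the sequence in the usage note is made up, and the code assumes that the first ATG downstream of the transcription start is the initiation codon, which is not always the case in real genes.

```python
def five_prime_utr(construct: str, transcription_start: int) -> str:
    """Return the sequence between the transcription start and the first ATG.

    construct: DNA sequence of the construct, sense strand, 5' to 3'.
    transcription_start: 0-based index of the first transcribed nucleotide.
    Assumption: the first downstream ATG is treated as the initiation codon.
    """
    seq = construct.upper()
    start_codon = seq.find("ATG", transcription_start)
    if start_codon == -1:
        raise ValueError("no initiation codon found downstream of the transcription start")
    return seq[transcription_start:start_codon]
```

For the made-up sequence "ggccTTAGCCACCatgGTG" with transcription starting at index 4, the function returns "TTAGCCACC", the stretch lying between the transcription initiation site and the initiation codon.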

B. Expression Products

Transgene constructs for use in the disclosed transgenic fish can encode any desired expression product, including peptides, proteins, and RNA. Expression products can include reporter proteins (for detection and quantitation of expression), and products having a biological effect on cells in which they are expressed (by, for example, adding a new enzymatic activity to the cell, or preventing expression of a gene). Many such expression products are known or can be identified.

Reporter Proteins

As used herein, a reporter protein is any protein that can be specifically detected when expressed. Reporter proteins are useful for detecting or quantitating expression from expression sequences. For example, operatively linking a nucleotide sequence encoding a reporter protein to tissue-specific expression sequences allows one to carefully study lineage development. In such studies, the reporter protein serves as a marker for monitoring developmental processes, such as cell migration. Many reporter proteins are known and have been used for similar purposes in other organisms. These include enzymes, such as β-galactosidase, luciferase, and alkaline phosphatase, that can produce specific detectable products, and proteins that can be directly detected. Virtually any protein can be directly detected by using, for example, specific antibodies to the protein. A preferred reporter protein that can be directly detected is the green fluorescent protein (GFP). GFP, from the jellyfish Aequorea victoria, produces fluorescence upon exposure to ultraviolet light without the addition of a substrate (Chalfie et al., Science 263:802-5 (1994)). Recently, a number of modified GFPs have been created that generate as much as 50-fold greater fluorescence than does wild type GFP under standard conditions (Cormack et al., Gene 173:33-8 (1996); Zolotukhin et al., J. Virol 70:4646-54 (1996)). This level of fluorescence allows the detection of low levels of tissue-specific expression in a living transgenic animal.

The use of reporter proteins that, like GFP, are directly detectable without requiring the addition of exogenous factors is preferred for detecting or assessing gene expression during zebrafish embryonic development. A transgenic zebrafish embryo, carrying a construct encoding a reporter protein and tissue-specific expression sequences, can provide a rapid real time in vivo system for analyzing spatial and temporal expression patterns of developmentally regulated genes.

C. Other Construct Sequences

The disclosed transgene constructs preferably include other sequences which improve expression from, or stability of, the construct. For example, including a polyadenylation signal on the constructs encoding a protein ensures that transcripts from the transgene will be processed and transported as mRNA. The identification and use of polyadenylation signals in expression constructs is well established. It is preferred that homologous polyadenylation signals be used in the transgene constructs.

It is also known that the presence of introns in primary transcripts can increase expression, possibly by causing the transcript to enter the processing and transport system for mRNA. It is preferred that an intron, if used, be included in the 5′ untranslated region or the 3′ untranslated region of the transgene transcript. It is also preferred that the intron be homologous to the fish used, and more preferably homologous to the expression sequences used (that is, that the intron be from the same gene that some or all of the expression sequences are from). The use and importance of these and other components useful for transgene constructs are discussed in Palmiter et al., Proc. Natl. Acad. Sci. USA 88:478-482 (1991); Sippel et al., “The Regulatory Domain Organization of Eukaryotic Genomes: Implications For Stable Gene Transfer” in Transgenic Animals (Grosveld and Kollias, eds., Academic Press, 1992), pages 1-26; Kollias and Grosveld, “The Study of Gene Regulation in Transgenic Mice” in Transgenic Animals (Grosveld and Kollias, eds., Academic Press, 1992), pages 79-98; and Clark et al., Phil. Trans. R. Soc. Lond. B. 339:225-232 (1993).

The disclosed constructs are preferably integrated into the genome of the fish. However, the disclosed transgene construct can also be constructed as an artificial chromosome. Such artificial chromosomes containing more than 200 kb have been used in several organisms. Artificial chromosomes can be used to introduce very large transgene constructs into fish. This technology is useful since it can allow faithful recapitulation of the expression pattern of genes that have regulatory elements that lie many kilobases from coding sequences.

2. Fish

The disclosed constructs and methods can be used with any type of fish. As used herein, fish refers to any member of the classes collectively referred to as pisces. It is preferred that fish belonging to species and varieties of fish of commercial or scientific interest be used. Such fish include salmon, trout, tuna, halibut, catfish, zebrafish, medaka, carp, tilapia, goldfish, and loach.

The most preferred fish for use with the disclosed constructs and methods is zebrafish, Danio rerio. Zebrafish are an increasingly popular experimental animal since they have many of the advantages of popular invertebrate experimental organisms, and include the additional advantage that they are vertebrates. Another significant advantage of zebrafish for the study of development and cell lineages is that, like Caenorhabditis, they are largely transparent (Kimmel, Trends Genet 5:283-8 (1989)). The generation of thousands of zebrafish mutants (Driever et al., Development 123:37-46 (1996); Haffter et al., Development 123:1-36 (1996)) provides abundant raw material for transgenic study of these animals. General zebrafish care and maintenance is described by Streisinger, Natl. Cancer Inst. Monogr. 65:53-58 (1984).

Zebrafish embryos are easily accessible and nearly transparent. Given these characteristics, a transgenic zebrafish embryo, carrying a construct encoding a reporter protein and tissue-specific expression sequences, can provide a rapid real time in vivo system for analyzing spatial and temporal expression patterns of developmentally regulated genes. In addition, embryonic development of the zebrafish is extremely rapid. In 24 hours an embryo develops rudiments of all the major organs, including a functional heart and circulating blood cells (Kimmel, Trends Genet 5:283-8 (1989)). Other fish with some or all of the same desirable characteristics are also preferred.

3. Production of Transgenic Fish

The disclosed transgenic fish are produced by introducing a transgene construct into cells of a fish, preferably embryonic cells, and most preferably in a single cell embryo. Where the transgene construct is introduced into embryonic cells, the transgenic fish is obtained by allowing the embryonic cell or cells to develop into a fish. Introduction of constructs into embryonic cells of fish, and subsequent development of the fish, are simplified by the fact that embryos develop outside of the parent fish in most fish species.

The disclosed transgene constructs can be introduced into embryonic fish cells using any suitable technique. Many techniques for such introduction of exogenous genetic material have been demonstrated in fish and other animals. These include microinjection (described by, for example, Culp et al. (1991)), electroporation (described by, for example, Inoue et al., Cell. Differ. Develop. 29:123-128 (1990); Müller et al., FEBS Lett. 324:27-32 (1993); Murakami et al., J. Biotechnol. 34:35-42 (1994); Müller et al., Mol. Mar. Biol. Biotechnol. 1:276-281 (1992); and Symonds et al., Aquaculture 119:313-327 (1994)), particle gun bombardment (Zelenin et al., FEBS Lett. 287:118-120 (1991)), and the use of liposomes (Szelei et al., Transgenic Res. 3:116-119 (1994)). Microinjection is preferred. The preferred method for introduction of transgene constructs into fish embryonic cells by microinjection is described in the examples.

Embryos or embryonic cells can generally be obtained by collecting eggs immediately after they are laid. Depending on the type of fish, it is generally preferred that the eggs be fertilized prior to or at the time of collection. This is preferably accomplished by placing a male and female fish together in a tank that allows egg collection under conditions that stimulate mating. After collecting eggs, it is preferred that the embryo be exposed for introduction of genetic material by removing the chorion. This can be done manually or, preferably, by using a protease such as pronase. A preferred technique for collecting zebrafish eggs and preparing them for microinjection is described in the examples. A fertilized egg cell prior to the first cell division is considered a one cell embryo, and the fertilized egg cell is thus considered an embryonic cell.

After introduction of the transgene construct the embryo is allowed to develop into a fish. This generally need involve no more than incubating the embryos under the same conditions used for incubation of eggs. However, the embryonic cells can also be incubated briefly in an isotonic buffer. If appropriate, expression of an introduced transgene construct can be observed during development of the embryo.

Fish harboring a transgene can be identified by any suitable means. For example, the genome of potential transgenic fish can be probed for the presence of construct sequences. To identify transgenic fish actually expressing the transgene, the presence of an expression product can be assayed. Several techniques for such identification are known and used for transgenic animals and most can be applied to transgenic fish. Probing of potential or actual transgenic fish for nucleic acid sequences present in or characteristic of a transgene construct is preferably accomplished by Southern or Northern blotting. Also preferred is detection using polymerase chain reaction (PCR) or other sequence-specific nucleic acid amplification techniques. Preferred techniques for identifying transgenic zebrafish are described in the examples.
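The idea of probing a genome for construct sequences can be mimicked computationally when sequence data are available. The following Python sketch is only an analogy to the laboratory techniques named above, not a description of them: the function names are hypothetical, the probe is assumed to contain only A, C, G and T, and matching is exact, whereas hybridization tolerates mismatches.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def carries_construct(genomic_sequence: str, probe: str) -> bool:
    """True if the probe, or its reverse complement, occurs in the genomic sequence.

    Both strands are checked because an integrated construct can lie in either
    orientation relative to the strand that was sequenced.
    """
    genome = genomic_sequence.upper()
    probe = probe.upper()
    return probe in genome or reverse_complement(probe) in genome
```

A probe would normally be chosen from sequence characteristic of the construct (for example, part of the reporter coding region) so that endogenous fish sequences do not give a match.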

4. Identifying the Pattern of Expression of Fish Genes

Identifying the pattern of expression in the disclosed transgenic fish can be accomplished by measuring or identifying expression of the transgene in different tissues (tissue-specific expression), at different times during development (developmentally regulated expression or developmental stage-specific expression), or in different cell lineages (cell lineage-specific expression). These assessments can also be combined by, for example, measuring expression (and observing changes, if any) in a cell lineage during development. The nature of the expression product to be detected can have an effect on the suitability of some of these analyses. On one level, different tissues of a fish can be dissected and expression can be assayed in the separate tissue samples. Such an assessment can be performed when using almost any expression product. This technique is commonly used in transgenic animals and is useful for assessing tissue-specific expression.

This technique can also be used to assess expression during the course of development by assaying for the expression product at different developmental stages. Where detection of the expression product requires fixing of the sample or other treatments that destroy or kill the developing embryo or fish, multiple embryos must be used. This is only practical where the expression pattern in different embryos is expected to be the same or similar. This will be the case when using the disclosed transgenic fish having stable and predictable expression.

A more preferred way of assessing the pattern of expression of a transgene during development is to use an expression product that can be detected in living embryos and animals. A preferred expression product for this purpose is the green fluorescent protein (GFP). A preferred form of GFP and a preferred technique for measuring the presence of GFP in living fish are described in the examples.

Expression products of the disclosed transgene constructs can be detected using any appropriate method. Many means of detecting expression products are known and can be applied to the detection of expression products in transgenic fish. For example, RNA can be detected using any of numerous nucleic acid detection techniques. Some of these detection methods as applied to transgenic fish are described in the examples. The use of reporter proteins as the expression product is preferred since such proteins are selected based on their detectability. The detection of several useful reporter proteins is described by Iyengar et al. (1996).

In zebrafish, the nervous system and other organ rudiments appear within 24 hours of fertilization. Since the nearly transparent zebrafish embryo develops outside its mother, the origin and migration of lineage progenitor cells can be monitored by following expression of an expression product in transgenic fish. In addition, the regulation of a specific gene can be studied in these fish.

Using zebrafish promoters that drive expression in specific tissues, a number of transgenic zebrafish lines can be generated that express a reporter protein in each of the major tissues including the notochord, the nervous system, the brain, the thymus, and in other tissues (see Table 1). Other important lineages for which specific expression can be obtained include neural crest, germ cells, liver, gut, and kidney. Additional tissue-specific transgenic fish can be generated by using “enhancer trap” constructs to identify expression sequences in fish.

TABLE 1

Source of Expression Sequences    Tissues/Cell lineages
GATA-1                            Erythroid progenitor
GATA-2                            Hematopoietic stem cells/CNS
Tinman                            Heart
Rag-1                             T and B Cells
Globin                            Mature red blood cells
MEF                               Muscle progenitors
Goosecoid                         Dorsal organizer
SCL-1                             Hematopoietic stem cells
Rbtn-2                            Hematopoietic stem cells
No-tail                           Notochord
Flk-1                             Vascular endothelia
Eve-1                             Ventral/posterior cells
Ikaros                            Early lymphoid progenitors
Pdx-1                             Pancreas
Islet-1                           Motoneuron
Shh                               Multi-tissue induction/Left-right symmetry
Twist                             Axial mesoderm/Left-right symmetry
Krox20                            Brain
BMP4                              Ventral mesoderm induction

5. Identifying Compounds that Affect Expression of Fish Genes

For many genes, and especially for genes involved in developmental processes, it would be useful to identify compounds that affect expression of the genes. The disclosed transgenic fish can be exposed to compounds to assess the effect of the compound on the expression of a gene of interest. For example, test compounds can be administered to transgenic fish harboring an exogenous construct containing the expression sequences of a fish gene of interest operably linked to a sequence encoding a reporter protein. By comparing the expression of the reporter protein in fish exposed to a test compound to those that are not exposed, the effect of the compound on the expression of the gene from which the expression sequences are derived can be assessed.

6. Identifying Genes that Affect Expression of Fish Genes

Numerous mutants have been generated and characterized in zebrafish which affect most developmental processes. The disclosed transgenic fish can be used in combination with these and other mutations to assess the effect of a mutant gene on the expression of a gene of interest. For example, mutations can be introduced into strains of transgenic fish harboring an exogenous construct containing the expression sequences of a fish gene of interest operably linked to a sequence encoding a reporter protein. By comparing the expression of the reporter protein in fish with a mutation to those without the mutation, the effect of the mutation on the expression of the gene from which the expression sequences are derived can be assessed.

The effect of such mutations on specific developmental processes and on the growth and development of specific cell lineages can also be assessed using the disclosed transgenic fish expressing a reporter protein in specific cell lineages or at specific developmental stages.

7. Genetically Marking Mutant Fish Genes

The disclosed transgene constructs can be used to genetically mark mutant genes or chromosome regions. For example, in zebrafish, recent chemical mutagenesis screens have generated more than one thousand different mutants with defects in most developmental processes. If fish carrying a mutation generated in these screens could be more easily identified, a lot of time and labor would be saved. One way to promote rapid identification of fish carrying mutations would be the establishment of balancer chromosomes that carry markers that can be easily identified in living fish. This technology has greatly facilitated the task of identification and maintenance of mutant stocks in Drosophila (Ashburner, Drosophila, A Laboratory Manual (Cold Spring Harbor Laboratory Press, Cold Spring Harbor, N.Y., 1989); Lindsley and Zimm, The Genome of Drosophila melanogaster (Academic Press, San Diego, Calif., 1995)). As used herein, genetically marking a gene or chromosome region refers to genetically linking a reporter gene to the gene or chromosome region. Genetic linkage between two genetic elements (such as genes) refers to the elements being in sufficiently close proximity on a chromosome that they do not segregate from each other at random in genetic crosses. The closer the genetic linkage, the more likely that the two elements will segregate together.
For genetic marking, it is preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 60% of the time, it is more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 70% of the time, it is still more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 80% of the time, it is still more preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 90% of the time, and it is most preferred that the transgene construct segregate with the gene or chromosomal region of interest more than 95% of the time.
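The co-segregation preferences stated for genetic marking can be expressed as a simple lookup. The following Python sketch is illustrative only. It assumes, as a simplification not stated in the text, that the fraction of meioses in which the transgene segregates with the marked gene equals one minus the recombination fraction between them. The function names and tier labels are hypothetical.

```python
# Co-segregation thresholds taken from the text, highest first.
# The labels are shorthand for this sketch, not terms defined in the text.
TIERS = [
    (0.95, "most preferred (>95%)"),
    (0.90, "still more preferred (>90%)"),
    (0.80, "still more preferred (>80%)"),
    (0.70, "more preferred (>70%)"),
    (0.60, "preferred (>60%)"),
]


def cosegregation_fraction(recombination_fraction):
    """Assumed model: co-segregation = 1 - recombination fraction."""
    if not 0.0 <= recombination_fraction <= 0.5:
        raise ValueError("recombination fraction must lie between 0 and 0.5")
    return 1.0 - recombination_fraction


def preference_tier(cosegregation):
    """Return the highest tier whose threshold is strictly exceeded."""
    for threshold, label in TIERS:
        if cosegregation > threshold:
            return label
    return "below the preferred range"
```

Under this assumed model, a transgene insertion showing 3% recombination with a mutation would fall in the most preferred tier, while unlinked elements (50% recombination) would fall below the preferred range.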

Example 1 shows that living transgenic fish carrying insertions of a transgene, in which the zebrafish GATA-1 promoter has been ligated to the green fluorescent protein (GFP) reporter gene, can be identified by simple observation of GFP expression in blood cells. As in Drosophila, zebrafish chromosomal recombination occurs at a significantly lower rate during spermatogenesis than it does during oogenesis. Therefore, a transgene insertion that maps near a chemically induced mutant gene can be crossed into the mutant chromosome through oogenesis and will then remain linked to the mutation in male fish through many generations. This procedure will allow the identification of progeny harboring the mutant gene by simple observation of GFP in blood cells.

In the case of zebrafish, 200 lines carrying the GATA-1/GFP transgene (or another reporter construct) randomly inserted throughout the zebrafish genome should result in an average of 8 insertions in each of the 25 zebrafish chromosomes. This is possible since expression from the disclosed constructs is not limited by effects of the site of insertion and the site of integration is not limited. The insertion sites can be mapped and then crossed through oogenesis into zebrafish lines that carry a mutation that maps nearby. Once established, mutant strains that carry balancer chromosomes can be maintained in male fish.
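The arithmetic behind the figure of 8 insertions per chromosome can be checked with a short calculation. The sketch below assumes one insertion site per line and that each insertion is equally likely to land on any chromosome. Both are simplifications introduced here (chromosome size differences are ignored), and the function name is hypothetical.

```python
def insertion_coverage(n_lines, n_chromosomes):
    """Expected insertions per chromosome, and the chance that a given
    chromosome receives none, under the uniform one-site-per-line
    assumption stated in the lead-in."""
    expected_per_chromosome = n_lines / n_chromosomes
    p_chromosome_unmarked = (1 - 1 / n_chromosomes) ** n_lines
    return expected_per_chromosome, p_chromosome_unmarked


# Figures from the text: 200 lines, 25 chromosomes.
expected, p_unmarked = insertion_coverage(200, 25)
print(expected)    # average insertions per chromosome
print(p_unmarked)  # chance a given chromosome carries no insertion
```

Under these assumptions the chance that any particular chromosome is left without a marker insertion is well below 0.1%.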

Although it is preferred that mutant genes be genetically marked, any gene of interest or any chromosome region can be marked, and the maintenance and inheritance of the gene can be monitored, in a similar manner. As used herein, an identified mutant gene is a mutant gene that is known or that has been identified, in contrast to a mutant gene which may be present in an organism but which has not been recognized.

Genetic mapping of mutant genes or transgenes in fish can be performed using established techniques and the principles of genetic crosses. Generally, mapping involves determining the linkage relationships between genetic elements by assessing whether, and to what extent, two or more genetic elements tend to cosegregate in genetic crosses.

8. Identifying Fish that have Inherited a Mutant Gene

Mutant fish in which the mutant gene is marked with an exogenous construct expressing a reporter protein simplify the identification of progeny fish that carry the mutant gene. For example, after a cross, progeny fish can be screened for expression of the reporter protein. Those that express the reporter protein are very likely to have inherited the mutant gene which is genetically linked. Those progeny fish not expressing the reporter protein can be excluded from further analysis.

Although recombination during gametogenesis may result in segregation of the exogenous construct from the mutant gene, this will happen only rarely. Initial screening for fish expressing the reporter protein will still ensure that the majority of such progeny fish will carry the mutant gene. Confirmation of the mutant can be established by subsequent direct testing for the mutant gene.

9. Identifying and Cloning Regulatory Sequences from Fish

The disclosed constructs can also be used as “enhancer traps” to generate transgenic fish that exhibit tissue-specific expression of an expression product. Transgenic animals carrying enhancer trap constructs often exhibit tissue-specific expression patterns due to the effects of endogenous enhancer elements that lie near the position of integration.

Once it is determined that the exogenous construct is operably linked to an enhancer or other regulatory sequence in a fish, the regulatory element can be isolated by re-cloning the transgene construct. Many general cloning techniques can be used for this purpose. A preferred method of cloning regulatory sequences that have become linked to a transgene construct in a fish is to isolate and cleave genomic DNA from the fish with a restriction enzyme that does not cleave the exogenous construct. The resulting fragments can be cloned in vitro and screened for the presence of characteristic transgene sequences. A search for enhancers in zebrafish using a transgene construct having only a promoter operably linked to a sequence encoding a reporter protein has generated a transgenic line that expresses GFP exclusively in hatching gland cells.

A similar procedure can be followed to identify promoters. In this case, a “promoter probe” construct, which lacks any expression sequences, is used. Only if the construct is inserted into the genome downstream of expression sequences will the expression product encoded by the construct be expressed.

10. Identifying Promoters and Enhancers in Cloned Expression Sequences

The linked genomic sequences of clones identified as containing expression sequences, or any other nucleic acid segment containing expression sequences, can then be characterized to identify potential and actual regulatory sequences. For example, a deletion series of a positive clone can be tested for expression in transgenic fish. Sequences essential for expression, or for a pattern of expression, are identified as those which, when deleted from a construct, no longer support expression or the pattern of expression. The ability to assess the pattern of expression of a transgene in fish using the disclosed transgenic fish and methods makes it possible to identify the elements in the regulatory sequences of a fish gene that are responsible for the pattern of expression. The disclosed transgenic fish, since they can be produced routinely and consistently, allow meaningful comparison of the expression of different deletion constructs in separate fish.

An example of the power of this capability is described in Example 2. Application of this system to the study of the GATA-2 promoter has led to identification of enhancer regions that facilitate gene expression specifically in hematopoietic precursors, the enveloping layer (EVL) and the central nervous system (CNS). Through site-directed mutagenesis, it has been discovered that the DNA sequence CCCTCCT is essential for the neuron-specific activity of the GATA-2 promoter. This is described in Example 2.
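Once a short element such as CCCTCCT has been implicated, candidate regulatory sequences can be scanned for it on both strands. The following Python sketch shows one way to do this. The function names are hypothetical, and the example sequence is made up for illustration; it is not taken from the GATA-2 promoter.

```python
MOTIF = "CCCTCCT"  # element named in the text


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_motif_sites(sequence, motif=MOTIF):
    """Return sorted (0-based start, strand) pairs for every occurrence
    of the motif on the given strand (+) or the opposite strand (-)."""
    sequence = sequence.upper()
    sites = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = sequence.find(pattern)
        while start != -1:
            sites.append((start, strand))
            start = sequence.find(pattern, start + 1)
    return sorted(sites)


# Made-up sequence with one copy of the motif on each strand.
EXAMPLE = "AACCCTCCTGGTTAGGAGGGTT"
print(find_motif_sites(EXAMPLE))
```

Positions are reported on the input strand, so a "-" hit marks where the reverse complement AGGAGGG begins.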

11. Isolating Cells Expressing an Expression Product

Using cell sorting based on the presence of an expression product, pure populations of cells expressing a transgene construct can be isolated from other cells. Where the transgene construct is expressed in particular cell lineages or tissues, this can allow the purification of cells from that particular lineage. These cells can be used in a variety of in vitro studies. For instance, these pure cell populations can provide mRNA for differential display or subtractive screens for identifying genes expressed in that cell lineage. Progenitor cells of specific tissue could also be isolated. Establishing such cells in tissue culture would allow the growth factor needs of these cells to be determined. Such knowledge could be used to culture non-transgenic forms of the same cells or related cells in other organisms.

Cell sorting is preferably facilitated by using a construct expressing a fluorescent protein or an enzyme producing a fluorescent product. This allows fluorescence activated cell sorting (FACS). A preferred fluorescent protein for this purpose is the green fluorescent protein. The ability to generate transgenic fish expressing GFP in a tissue- and cell lineage-specific manner for different cell types indicates that transgenic fish that express GFP in other types of tissues can be generated in a straightforward manner. The disclosed FACS approach can therefore be used as a general method for isolating pure cell populations from developing embryos based solely on gene expression patterns. This method for isolation of specific cell lineages is preferably performed using constructs linking GFP with the expression sequences of genes identified as being involved in development. Numerous such genes have been or can be identified as mutants that affect development. Cells isolated in this manner should be useful in transplantation experiments.

Publications cited herein and the material for which they are cited are specifically incorporated by reference.

Those skilled in the art will recognize, or be able to ascertain using no more than routine experimentation, many equivalents to the specific embodiments of the invention described herein. Such equivalents are intended to be encompassed by the following claims.

Examples

Example 1: Tissue-Specific Expression and Germline Transmission of a Transgene in Zebrafish

In this example, DNA constructs containing the putative zebrafish expression sequences of GATA-1, an erythroid-specific transcription factor, operatively linked to a sequence encoding the green fluorescent protein (GFP), were microinjected into single-cell zebrafish embryos.

GATA-1, an early marker of the erythroid lineage, was initially identified through its effects upon globin gene expression (Evans and Felsenfeld, Cell 58:877-85 (1989); Tsai et al., Nature 339:446-51 (1989)). Since then GATA-1 has been shown to be a member of a multigene family. Members of this gene family encode transcription factors that recognize the DNA core consensus sequence, WGATAR (SEQ ID NO:18). GATA factors are key regulators of many important developmental processes in vertebrates, particularly hematopoiesis (Orkin, Blood 80:575-81 (1992)). The importance of GATA-1 for hematopoiesis was definitively demonstrated in null mutations in mouse (Pevny et al., Nature 349:257-60 (1991)). In chimeric mice, embryonic stem cells carrying a null mutation in GATA-1, created via homologous recombination, contributed to all non-hematopoietic tissues tested and to a white blood cell fraction, but failed to give rise to mature red blood cells.

In zebrafish, GATA-1 expression is restricted to erythroid progenitor cells that initially occupy a ventral extra-embryonic position, similar to the situation found in other vertebrates (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). As development proceeds, these cells enter the zebrafish embryo and form a distinct structure known as the hematopoietic intermediate cell mass (ICM).

Vertebrate hematopoiesis is a complex process that proceeds in distinct phases, at various anatomic sites, during development (Zon, Blood 86:2876-91 (1995)). Although studies on in vitro model systems have generated some insight into hematopoietic development (Cumano et al., Cell 86:907-16 (1996); Kennedy et al., Nature 386:488-493 (1997); Medvinsky and Dzierzak, Cell 86:897-906 (1996); Nakano et al., Science 272:722-4 (1996)), the origin of hematopoietic progenitor cells during vertebrate embryogenesis is still controversial. Therefore, an in vivo model should be useful to determine precisely the cellular and molecular mechanisms involved in hematopoietic development. Such a model could also be used to identify compounds and genes that affect hematopoiesis. In mammals, since embryogenesis occurs internally, it is difficult to carefully observe hematopoietic processes.

Zebrafish have a number of features that facilitate the study of vertebrate hematopoiesis. Because development is external and embryos are nearly transparent, the migration of labeled hematopoietic cells can be easily monitored. In addition, many mutants that are defective in hematopoietic development have been generated (Ransom et al., Development 123:311-319 (1996); Weinstein et al., Development 123:303-309 (1996)). Zebrafish embryos that significantly lack circulating blood can survive for several days, so downstream effects of mutations upon gene expression deleterious to embryonic hematopoietic development can be characterized. Since the cellular processes and molecular regulation of hematopoiesis are generally conserved throughout vertebrate evolution, results from zebrafish embryonic studies can also provide insight into the mechanisms involved in mammalian hematopoiesis.

Cloning and Sequencing of GATA-1 Genomic DNA

A zebrafish genomic phage library was screened with a ³²P radiolabeled probe containing a region of zebrafish GATA-2 cDNA that encodes a conserved zinc finger. A number of positive clones were identified. The inserts in these clones were cut with various restriction enzymes. The resulting fragments were subcloned into pBluescript II KS(−) and sequenced. Based on DNA sequence analysis, two phage clones were shown to contain zebrafish GATA-1 sequences. The cDNA sequence of zebrafish GATA-1 is described by Detrich et al., Proc. Natl. Acad. Sci. USA 92:10713 (1995). The nucleotide sequence of the GATA-1 promoter region is shown in SEQ ID NO:26.

Plasmid Constructs

Construct G1-(Bgl)-GM2 was generated by ligating a modified GFP reporter gene (GM2) to a 5.4 kb EcoRI/BglII fragment that contains putative zebrafish GATA-1 expression sequences, that is, the 5′ flanking sequences upstream of the major GATA-1 transcription start site. GM2 contains 5′ wild type GFP and a 3′ NcoI/EcoRI fragment derived from a GFP variant, m2, that emits approximately 30 fold greater fluorescence than does the wild type GFP under standard FITC conditions (Cormack et al., Gene 173:33-8 (1996)). This construct is illustrated as construct (1) in FIG. 2.

To isolate expression sequences in the 5′ untranslated region of GATA-1, a 5.6 kb DNA fragment was amplified by the polymerase chain reaction (PCR) from a GATA-1 genomic subclone using a T7 primer which is complementary to the vector sequence, and a specific primer, Oligo (1), that is complementary to the cDNA sequence just 5′ of the GATA-1 translation start. The GATA-1 specific primer contained a BamHI site to facilitate subsequent cloning. The PCR reaction was performed using Expand™ Long Template PCR System (Boehringer Mannheim) for 30 cycles (94° C., 30 seconds; 60° C., 30 seconds; 68° C., 5 minutes). After digestion with BamHI and XhoI, this 5.6 kb DNA fragment was gel purified and ligated to DNA encoding the modified GFP, resulting in construct G1-GM2 (construct (2) in FIG. 2). The construct G1-(5/3)-GM2 was generated by ligating an additional 4 kb of GATA-1 genomic sequences, which contains GATA-1 intron and exon sequences, to the 3′ end (following the polyadenylation signal) of the reporter gene in construct G1-GM2. This construct is illustrated as construct (3) in FIG. 2.

Fish and Microinjection

Wild type zebrafish embryos were used for all microinjections. The zebrafish were originally obtained from pet shops (Culp et al., Proc Natl Acad Sci USA 88:7953-7 (1991)). Fish were maintained on reverse osmosis-purified water to which Instant Ocean (Aquarium Systems, Mentor, Ohio) was added (50 mg/l). Plasmid DNA G1-GM2 was linearized using restriction enzyme AatII (which cuts in the vector backbone), while plasmid DNA G1-(5/3)-GM2 was excised from the vector by digestion with restriction enzyme SacI, and separated using a low melting agarose gel. DNA fragments were cleaned using GENECLEAN II Kit (Bio101 Inc.) and resuspended in 5 mM Tris, 0.5 mM EDTA, 0.1 M KCl at a final concentration of 50 μg/ml prior to microinjection. Single cell embryos were prepared and injected as described by Culp et al., Proc Natl Acad Sci USA 88:7953-7 (1991), except that tetramethyl-rhodamine dextran was included as an injection control. This involved collecting newly fertilized eggs, dechorionating the eggs with pronase (used at 0.5 mg/ml), and injecting DNA. Injection with each construct was done independently 5 to 10 times and the data obtained were pooled.

Fluorescent Microscopic Observation and Imaging

Embryos and adult fish were anesthetized using tricaine (Sigma A-5040) as described previously (Westerfield, The Zebrafish Book (University of Oregon Press, 1995)) and examined under a FITC filter on a Zeiss microscope equipped with a video camera. Images of circulating blood cells were produced by printing out individual frames of recorded videos. Other pictures of fluorescent embryos were generated by superimposing a bright field image on a fluorescent image using Adobe Photoshop software. One month old fish were anesthetized and then rapidly embedded in OCT. Sections of 60 μm were cut using a cryostat and were immediately observed by fluorescence microscopy.

Identification of Germline Transgenic Fish by PCR

DNA isolation, internal control primers and PCR conditions were the same as described by Lin et al., Dev Biol 161:77-83 (1994). Briefly, DNA was extracted from pools of 40 to several hundred dechorionated embryos (obtained from mating a single pair of fish) at 16 to 24 hours of development by vortexing for 1 minute in a buffer containing 4 M guanidinium isothiocyanate, 0.25 mM sodium citrate (pH 7.0), 0.5% Sarkosyl, and 0.1 M β-mercaptoethanol. The sample was extracted once with phenol:chloroform:isoamyl alcohol (25:24:1) and total nucleic acid was precipitated by the addition of 3 volumes of ethanol and 1/10 volume sodium acetate (3 M, pH 5.5). The pellet was washed once in 70% ethanol and dissolved in 1× TE (pH 8.0).

Approximately 0.5 μg of DNA was used in a PCR reaction containing 20 mM Tris (pH 8.3), 1.5 mM MgCl₂, 25 mM KCl, 100 μg/ml gelatin, 20 pmole each PCR primer, 50 μM each dNTP, and 2.5 U Taq DNA polymerase (Pharmacia). The reaction was carried out at 94° C. for 2.5 minutes for 30 cycles with a 5 minute initial 94° C. denaturation step, and a 7 minute final 72° C. elongation step. Specific primers, Oligos (2) and (3), that were used to detect GFP, generated a 267 bp product. A pair of internal control primers homologous to sequences of the zebrafish homeobox gene, ZF-21 (Njolstad et al., FEBS Letters 230:25-30 (1988)), was included in each reaction. This pair of primers should generate a PCR product of 475 bp for all PCR reactions using zebrafish DNA.
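The logic of reading this duplex PCR can be summarized in a few lines. The sketch below is a hypothetical helper, not part of the described procedure: the function name, the result labels, and the 15 bp size tolerance are assumptions chosen for illustration. Only the 267 bp and 475 bp product sizes come from the text.

```python
GFP_PRODUCT_BP = 267      # Oligos (2) and (3), from the text
CONTROL_PRODUCT_BP = 475  # ZF-21 internal control, from the text


def classify_pcr(band_sizes_bp, tolerance_bp=15):
    """Interpret the band sizes observed for one embryo pool.

    tolerance_bp is an assumed allowance for gel sizing error.
    """
    def has_band(expected):
        return any(abs(size - expected) <= tolerance_bp
                   for size in band_sizes_bp)

    # Without the internal control band the reaction cannot be scored.
    if not has_band(CONTROL_PRODUCT_BP):
        return "uninformative"
    if has_band(GFP_PRODUCT_BP):
        return "transgene detected"
    return "transgene not detected"
```

Requiring the control band first mirrors the purpose of including the ZF-21 primers in each reaction: a missing GFP band is only meaningful when the reaction itself worked.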

Preparation of Embryonic Cells and Flow Cytometry

Embryos were disrupted in Holtfreter's solution using a 1.5 ml pellet pestle (Kontes Glass, OEM749521-1590). Cells were collected by centrifugation (400 g, 5 minutes). After digestion with 1× Trypsin/EDTA for 15 minutes at 32° C., the cells were washed twice with phosphate buffered saline (PBS) and filtered through a 40 micron nylon mesh. Fluorescence activated cell sorting (FACS) was performed under standard FITC conditions.

cDNA Synthesis and PCR

Total RNA was extracted from FACS purified cells using the RNA isolation kit, TRIzol (Bio101). Reverse transcription and PCR (RT-PCR) were performed using the Access RT-PCR System from Promega (Catalog #A1250). Specific primers, Oligos (4) and (5), used to detect the zebrafish GATA-1 cDNA, generated a 410 bp product.

Oligonucleotides

(1) 5′-CCGGATCCTGCAAGTGTAGTATTGAA-3′ (GATA-1, promoter antisense; SEQ ID NO: 1);
(2) 5′-AATGTATCAATCATGGCAGAC-3′ (GM2 sense; SEQ ID NO: 2);
(3) 5′-TGTATAGTTCATCCATGCCATGTG-3′ (GM2 antisense; SEQ ID NO: 3);
(4) 5′-ATGAACCTTTCTACTCAAGCT-3′ (GATA-1, cDNA sense; SEQ ID NO: 4);
(5) 5′-GCTGCTTCCACTTCCACTCAT-3′ (GATA-1, cDNA antisense; SEQ ID NO: 5).
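As a quick consistency check on the listed primers, the sequences can be inspected programmatically. The following Python sketch is illustrative: the dictionary and helper names are hypothetical, while the sequences are copied from the list above and GGATCC is the standard BamHI recognition sequence.

```python
# Oligonucleotides (1) to (5), written 5' to 3' as listed in the text.
OLIGOS = {
    1: "CCGGATCCTGCAAGTGTAGTATTGAA",
    2: "AATGTATCAATCATGGCAGAC",
    3: "TGTATAGTTCATCCATGCCATGTG",
    4: "ATGAACCTTTCTACTCAAGCT",
    5: "GCTGCTTCCACTTCCACTCAT",
}
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence


def gc_fraction(seq):
    """Fraction of G and C bases in the sequence."""
    return (seq.count("G") + seq.count("C")) / len(seq)


for number, seq in OLIGOS.items():
    # str.find returns -1 when the site is absent.
    print(number, len(seq), round(gc_fraction(seq), 2), seq.find(BAMHI_SITE))
```

Oligo (1) carries GGATCC starting at its third base, consistent with the statement that the GATA-1 specific primer contained a BamHI site; none of the other four primers contains the site.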

Whole-Mount RNA in Situ Hybridization

Sense and antisense digoxigenin-labeled RNA probes were generated from a GATA-1 genomic subclone containing the second and third exon coding sequence using a DIG/Genius™ 4 RNA Labeling Kit (SP6/T7) (Boehringer Mannheim). RNA in situ hybridizations were performed as described (Westerfield, The Zebrafish Book (University of Oregon Press, 1995)).

Genomic Structure of the Zebrafish GATA-1

Two clones containing zebrafish GATA-1 sequences were isolated from a lambda phage zebrafish genomic library as described above. Restriction enzyme mapping indicated that the two overlapping clones contained approximately 35 kb of the GATA-1 locus. To define the promoter of the zebrafish GATA-1 gene, transcription initiation sites for the zebrafish GATA-1 were mapped by primer extension. As in chicken, mouse, human and other species, multiple transcription initiation sites were identified. A major transcription initiation site was mapped 187 bases upstream of the translation start.

Comparison of the GATA-1 genomic structure for human, mouse and chicken suggested that the intron-exon junction sequences of this gene are likely to be conserved throughout vertebrates. Oligonucleotide primers flanking potential GATA-1 introns were designed and used to sequence the zebrafish genomic clones. Sequence analysis revealed that the zebrafish GATA-1 gene consists of five exons and four introns which lie within a 6.5 kb genomic region (FIG. 1). Although the exon-intron number and junction sequences are well conserved between zebrafish and other vertebrates, the zebrafish GATA-1 introns are smaller than in other species.

Transient Expression of GFP Driven by the GATA-1 Promoter in Zebrafish Embryos

Based on the zebrafish GATA-1 genomic structure, three GFP reporter gene constructs were generated (FIG. 2). Construct G1-(Bgl)-GM2 was generated by ligation of a modified GFP reporter gene (GM2) to a 5.4 kb EcoRI/BglII fragment that contains the 5′ flanking sequences upstream of the major GATA-1 transcription start site. Construct G1-GM2 contained a 5.6 kb region upstream of the translation start of GATA-1. The third construct, G1-(5/3)-GM2, was generated by ligating an additional 4 kb of GATA-1 genomic sequences, which contain intron and exon sequences, to the 3′ end of the reporter gene in construct G1-GM2. Each construct was microinjected into the cytoplasm of single cell zebrafish embryos. GFP reporter gene expression in the embryos was examined at a number of distinct developmental stages by fluorescence microscopy.

GFP expression was observed in embryos injected with either construct G1-GM2 or construct G1-(5/3)-GM2 as early as 80% epiboly, approximately 8 hours post fertilization (pf). At that time, GFP positive cells were restricted to the ventral region of the injected embryos. At 16 hours pf, GFP expression was clearly visible in the developing intermediate cell mass (ICM), the earliest hematopoietic tissue in zebrafish. After 24 hours pf, GFP positive cells were observed in circulating blood and could be continuously observed in circulating blood for several months. During the first five days pf, examination of circulating blood revealed two distinct cell populations with different levels of GFP expression. One cell type was larger and brighter; the other smaller and less bright. No significant difference in GFP expression levels was detected between embryos injected with either construct G1-GM2 or G1-(5/3)-GM2. However, injection of construct G1-(Bgl)-GM2 yielded very weak GFP expression in developing embryos. This result indicated that either the GATA-1 transcription initiation site was removed by BglII restriction digestion, or that the 5′ untranslated region of zebrafish GATA-1 is required for high level tissue specific expression of GFP. It is not surprising that a construct lacking the 5′ untranslated region of GATA-1 did not generate much GFP expression in microinjected embryos. These regions are often needed for transcript stability. At times, these regions also contain binding sites for regulators of gene expression.

At least 75% of the embryos injected with G1-GM2 or G1-(5/3)-GM2 construct showed some degree of ICM specific GFP expression (Table 2). The number of GFP positive cells in the ICM or in circulation ranged from a single cell to a few hundred cells. Less than 7% of these embryos showed GFP expression in non-hematopoietic tissues, usually limited to fewer than ten cells per embryo. Non-specific expression of GFP was usually observed in the notochord, muscle, and enveloping cell layers, and was limited to no more than 10 cells per embryo. These observations indicated that a genomic GATA-1 fragment extending approximately 5.6 kb upstream from the GATA-1 translation start site ligated to GFP sufficed to recapitulate the embryonic pattern of GATA-1 expression in zebrafish.

TABLE 2

Entries in the last three columns are numbers of embryos, with the percentage of observed embryos in parentheses.

Construct       No. observed   GFP expression   Strong GFP expression   Non-specific GFP
                embryos        in ICM (%)       in ICM (%)^(a)          expression (%)
G1-GM2          336            274 (81.5%)      177 (52.7%)             15 (4.5%)
G1-(5/3)-GM2    248            187 (75.4%)      150 (60.5%)             16 (6.5%)
G1-(Bgl)-GM2    370            0 (0%)           0 (0%)                  19 (5.1%)

^(a)Strong GFP expression means that each embryo has more than 10 green fluorescent cells in the ICM.
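The percentages in Table 2 follow directly from the embryo counts and can be recomputed as a check. In the sketch below the counts are copied from Table 2; the dictionary layout and function name are hypothetical, and the constructs are labeled with the names used in the running text.

```python
# (observed, GFP in ICM, strong GFP in ICM, non-specific GFP), from Table 2.
TABLE_2 = {
    "G1-GM2": (336, 274, 177, 15),
    "G1-(5/3)-GM2": (248, 187, 150, 16),
    "G1-(Bgl)-GM2": (370, 0, 0, 19),
}


def percentages(observed, *counts):
    """Each count as a percentage of observed embryos, to one decimal."""
    return tuple(round(100 * count / observed, 1) for count in counts)


for construct, row in TABLE_2.items():
    print(construct, percentages(*row))
```

The recomputed values agree with the percentages printed in the table for all three constructs.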

GFP Expression in Germline GATA-1/GFP Transgenic Zebrafish

Microinjected zebrafish embryos were raised to sexual maturity and mated. Progeny were tested by PCR to determine the frequency of germline transmission of the GATA-1/GFP transgene. Nine of six hundred and seventy two founder fish have transmitted GFP to the F1 generation. Examination of these fish by fluorescence microscopy revealed that seven of eight lines expressed GFP in the ICM and in circulating blood cells. GFP expression patterns in the ICM were consistent with the RNA in situ hybridization patterns previously observed for GATA-1 mRNA expression in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). In the two lines where F2 transgenic fish have been obtained, GFP expression in blood cells was observed in 50% of the progeny when a transgenic F2 was mated to a non-transgenic fish. This indicated that GFP was transmitted to progeny in a Mendelian fashion. Southern blot analysis showed that GFP transgene insertions occurred at different sites in these two lines. In one line, transgenic fish apparently carry 4 copies of the transgene and in the other line, 7 copies.
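Whether an observed proportion of GFP-positive progeny is compatible with the 1:1 ratio expected from such a cross can be checked with a chi-square goodness-of-fit test. The sketch below is illustrative: the text reports only the 50% figure, so the progeny counts used in the usage lines are hypothetical, as are the function names. The critical value 3.841 is the standard chi-square threshold for one degree of freedom at the 0.05 level.

```python
CHI2_CRITICAL_DF1_P05 = 3.841  # chi-square, 1 degree of freedom, alpha 0.05


def chi_square_1_to_1(n_positive, n_negative):
    """Goodness-of-fit statistic against an expected 1:1 ratio."""
    total = n_positive + n_negative
    if total == 0:
        raise ValueError("no progeny scored")
    expected = total / 2
    return ((n_positive - expected) ** 2
            + (n_negative - expected) ** 2) / expected


def consistent_with_mendelian(n_positive, n_negative):
    """True if the 1:1 hypothesis is not rejected at the 0.05 level."""
    return chi_square_1_to_1(n_positive, n_negative) < CHI2_CRITICAL_DF1_P05


# Founder transmission frequency reported in the text: 9 of 672 fish.
founder_rate_percent = round(100 * 9 / 672, 2)

# Hypothetical counts, for illustration only.
print(consistent_with_mendelian(52, 48))
print(consistent_with_mendelian(80, 20))
```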

Blood cells were collected from 48 hour transgenic fish by heart puncture and a blood smear was observed by fluorescence microscopy. Two distinct populations of fluorescent cells were observed in these smears. As in the circulation of embryos that transiently express GFP, one cell population was observed that was large and bright and another that was smaller and less bright. Although the blood cells collected from adult transgenic zebrafish showed some variability in fluorescence intensity, they appeared to have uniform size. Blood cells collected from non-transgenic fish showed no fluorescence.

In two day old transgenic zebrafish, weak GFP expression was observed in the heart. GFP expression was also observed in the eyes and, in three of seven transgenic lines, in some neurons of the spinal cord. Expression in the eyes peaked between 30 and 48 hours pf and became extremely weak by day 4. It is thought that expression of GFP in eyes and neurons may replicate the authentic GATA-1 expression pattern.

Examination of GFP expression in tissues of one month old fish showed that the head kidney contained a large number of fluorescent cells. This result suggests that the kidney is the site of adult erythropoiesis in zebrafish. It has been reported that GATA-1 is expressed in the testes of mice. Expression of GFP was not found in testes dissected from adult fish. It is possible that the disclosed GATA-1 transgene constructs lack an enhancer required for testis expression of GATA-1. Other tissues including brain, muscle and liver had no detectable level of GFP expression.

FACS Analysis of GATA-1/GFP Transgenic Fish

GFP expression in GATA-1/GFP transgenic fish allowed isolation of a pure population of the earliest erythroid progenitor cells for in vitro studies by fluorescence activated cell sorting. F1 transgenic embryos were collected at the onset of GFP expression and cell suspensions were prepared. Approximately 3.6% of the cells from whole transgenic fish were fluorescence positive, as compared to 0.12% in the non-transgenic controls. Based on the number of embryos used, FACS analysis suggested that there are approximately three hundred erythroid progenitor cells per embryo at 14 hours pf.
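The arithmetic behind a FACS estimate of this kind can be sketched as follows. The two percentages are taken from the paragraph above; the helper `positive_cells_per_embryo` and its inputs are hypothetical, because the text does not state how many embryos or sorted events were used.

```python
# Percent of sorted cells scored GFP-positive, as reported in the text.
transgenic_pct = 3.6
control_pct = 0.12

# Background-subtracted positive fraction and signal-to-background ratio.
background_subtracted_pct = transgenic_pct - control_pct
signal_to_background = transgenic_pct / control_pct


def positive_cells_per_embryo(positive_events: float, embryos: int) -> float:
    """Average GFP-positive cells recovered per embryo.

    Hypothetical helper: both arguments are illustrative inputs,
    not values reported in the text.
    """
    if embryos <= 0:
        raise ValueError("embryos must be a positive integer")
    return positive_events / embryos
```

For example, 30,000 positive events recovered from 100 embryos (illustrative numbers only) would correspond to 300 positive cells per embryo.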

To determine whether the FACS purified cells are enriched for GATA-1, RNA was isolated from these cells and GATA-1 mRNA levels were determined by RT-PCR. The results indicated that these cells were highly enriched for GATA-1 mRNA.

Erythroid specific expression was observed in living embryos during early development. Fluorescent circulating blood cells were detected in microinjected embryos 24 hours after fertilization and could still be observed in two month old fish. Germline transgenic fish obtained from the injected founders continued to express GFP in erythroid cells in the F1 and F2 generations. The GFP expression patterns in transgenic fish were consistent with the RNA in situ hybridization pattern generated for GATA-1 mRNA expression. These transgenic fish allowed isolation, by fluorescence activated cell sorting, of the earliest erythroid progenitor cells from developing embryos. Using constructs containing other zebrafish promoters and GFP, it will be possible to generate transgenic fish that allow continuous visualization of the origin and migration of any lineage specific progenitor cells in a living embryo.

The results described in this example indicate that monitoring GFP expression can be a more sensitive method than RNA in situ detection by which to determine gene expression patterns. For instance, in the disclosed GATA-1/GFP transgenic fish, GFP expression in circulating blood allowed two types of cells to be distinguished. One cell type was larger and brighter; the other smaller and less bright. There were fewer of the larger, brighter cell type. These cells are believed to be erythroid precursors while the more abundant, smaller cells are believed to be fully differentiated erythrocytes. Preliminary cell transplantation experiments with embryonic blood cells have shown that they contain a cell population that has long-term proliferation capacity.

In two day old transgenic zebrafish, GFP expression was observed in the heart. In adult transgenic zebrafish, GFP expression was observed in the kidney. By histological methods, it has been shown that the heart endocardium is a transitional site for hematopoiesis in embryonic zebrafish and that the kidney is the site of adult hematopoiesis (Al-Adhami and Kunz, Develop. Growth and Differ. 19:171-179 (1977)). The results in GATA-1/GFP transgenic fish support these observations.

The GFP expression seen in the eyes and neurons of embryonic transgenic fish may be due to a lack of a transcriptional silencer in the transgene constructs. It seems unlikely that the GFP expression in the eyes is due to positional effects caused by the sites of insertion since all seven transgenic lines have GFP expression in embryonic fish eyes.

Using fluorescence activated cell sorting, pure populations of hematopoietic progenitor cells were isolated from the ICM of transgenic zebrafish. Since approximately 10⁷ cells can be sorted per hour, 10⁵ to 10⁶ purified ICM cells can be obtained in a few hours. These cells, which are derived from the earliest site of hematopoiesis in zebrafish, can be used in a variety of in vitro studies. For instance, these pure cell populations can provide mRNA for differential display or subtractive screens for identifying novel hematopoietic genes. Erythroid precursors obtained from the ICM might also be established in tissue culture. This would allow the growth factor needs of these cells to be determined.
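The yield estimate above can be checked with a short calculation. This is a sketch under stated assumptions: the sort rate of 10⁷ cells per hour comes from the text, while applying the 3.6% positive fraction from the FACS analysis to every sort, and the `recovery` parameter, are assumptions made only for illustration.

```python
SORT_RATE_CELLS_PER_HOUR = 1e7   # from the text
POSITIVE_FRACTION = 0.036        # assumption: 3.6% of input cells are GFP-positive


def purified_cells(hours: float, recovery: float = 1.0) -> float:
    """Expected number of GFP-positive cells collected after `hours` of sorting.

    `recovery` (0 to 1) is a hypothetical factor for sorting losses.
    """
    return hours * SORT_RATE_CELLS_PER_HOUR * POSITIVE_FRACTION * recovery


def hours_needed(target_cells: float, recovery: float = 1.0) -> float:
    """Sorting time required to collect `target_cells` positive cells."""
    return target_cells / (SORT_RATE_CELLS_PER_HOUR * POSITIVE_FRACTION * recovery)
```

Under these assumptions one hour of sorting yields a few times 10⁵ positive cells, and 10⁶ cells take a little under three hours, which is in line with the range quoted in the text.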

The approach to obtaining and studying transgene expression in erythroid cells described above is generally applicable to the study of any developmentally regulated process. This approach can also be applied to the identification of cis-acting promoter elements that are required for tissue specific gene expression (see Example 2). The analysis of promoter activity in a whole animal is desirable since dynamic temporal and spatial changes in a cellular microenvironment can be only poorly mimicked in vitro. The ease of generating and maintaining a large number of transgenic zebrafish lines makes obtaining statistically significant results practical. Finally, transgenic zebrafish that express GFP in specific tissues provide useful markers for identifying mutations that affect these lines in genetic screens. Given the genetic resources and embryological methods available for zebrafish, transgenic zebrafish exhibiting tissue-specific GFP expression are very valuable tools for dissecting developmental processes.

Example 2 Identification of Enhancers in GATA-2 Expression Sequences

A large number of studies have shown that neuronal cell determination in invertebrates occurs in progressive waves that are regulated by sequential cascades of transcription factors. Much less is known about this process in vertebrates. It was realized that an integrated approach combining embryological, genetic and molecular methods, such as that used to study neurogenesis in Drosophila (Ghysen et al., Genes & Dev 7:723-33 (1993)), would facilitate the identification of the molecular mechanisms involved in specifying neuronal fates in vertebrates. The following is an example of identification of cis-acting sequences that control neuron-specific gene expression in a vertebrate. Such identification is an initial step toward unraveling similar cascades in a vertebrate.

Transcription factors bind to cis-acting DNA sequences (sometimes referred to as response sequences) to regulate transcription. Often these transcription factors are members of multigene families that have overlapping, but distinct, expression patterns and functions. The transcription factor GATA-2 is a member of such a gene family (Yamamoto et al., Genes Dev 4:1650-62 (1990)). Each member of the GATA gene family is characterized by its ability to bind to cis-acting DNA elements with the consensus core sequence WGATAR (Orkin, Blood 80:575-81 (1992); SEQ ID NO:18). All protein products of the GATA family contain two copies of a highly conserved structural motif, commonly known as a zinc finger, which is required for DNA binding (Martin and Orkin, Genes Dev 4:1886-98 (1990)). Six members of the GATA family have been identified in vertebrates (Orkin, Blood 80:575-81 (1992), Orkin, Curr Opin Cell Biol 7:870-7 (1995)). Pannier, another member of the GATA gene family, is expressed in Drosophila neuronal precursors and inhibits expression of achaete-scute, a gene complex that plays a critical role in neurogenesis in Drosophila (Ramain et al., Development 119:1277-91 (1993)).

In chicken and mouse, the transcription factor GATA-2 is expressed in hematopoietic precursors, immature erythroid cells, proliferating mast cells, the central nervous system (CNS), and sympathetic neurons (Yamamoto et al., Genes & Dev 4:1650-62 (1990), Orkin, Blood 80:575-81 (1992), Jippo et al., Blood 87:993-8 (1996)). Studies in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)) and Xenopus (Zon et al., Proc Natl Acad Sci USA 88:10642-6 (1991), Kelley et al., Dev Biol 165:193-205 (1994)) have also shown that GATA-2 expression is restricted to hematopoietic tissues and the CNS. Homozygous null mutants, created in mouse via homologous recombination, have profound deficits in all hematopoietic lineages (Tsai et al., Nature 371:221-6 (1994)). The role played by GATA-2 in neuronal tissue of these mice has not been carefully examined, perhaps because the embryos die before day E11.5. Analysis of GATA-2 expression in chick embryonic neuronal tissue after notochord ablation has suggested that GATA-2 plays a role in specifying a neurotransmitter phenotype (Groves et al., Development 121:887-901 (1995)). In addition, GATA factors are required for activity of the neuron-specific enhancer of the gonadotropin-releasing hormone gene (Lawson et al., Mol Cell Biol 16:3596-605 (1996)).

The effects of various hematopoietic growth factors on GATA-2 expression have been carefully studied in tissue culture systems (Weiss et al., Exp Hematol 23:99-107 (1995)) and some growth factors have been shown to have dramatic effects on early embryonic GATA-2 expression (Walmsley et al., Development 120:2519-29 (1994), Maeno et al., Blood 88:1965-72 (1996)). In addition, nuclear translocation of a maternally supplied CCAAT binding transcription factor has been shown to be necessary for the onset of GATA-2 transcription at the mid-blastula transition in Xenopus (Brewer et al., Embo J 14:757-66 (1995)). However, prior to the disclosed work, nothing was known about the mechanisms that control neuron-specific expression of this gene.

Cloning and Sequencing of 5′ Part of GATA-2 Genomic DNA

A zebrafish genomic phage library was screened with the conserved zinc finger domain of zebrafish GATA-2 cDNA radiolabeled with ³²P. Two positive clones, λGATA-21 and λGATA-22, were identified. Restriction fragments of λGATA-21 were subcloned into pBluescript II KS(−). DNA sequence of the resulting clones was obtained from −4807 to +2605 relative to the GATA-2 translation start. Nucleotide sequence of the GATA-2 promoter region is shown in SEQ ID NO:27. Unless otherwise indicated, positions within the GATA-2 clones use this numbering. The 7.3 kb region upstream of the translation start in λGATA-21 was amplified by the polymerase chain reaction (PCR) using Expand™ Long Template PCR System (Boehringer Mannheim) for 25 cycles (94° C., 30 seconds; 68° C., 8 minutes). Primers used were a T7 primer and a primer specific for sequences 5′ to the GATA-2 translation start site (5′-ATGGATCCTCAAGTGTCCGCGCTTAGAA-3′; SEQ ID NO:19). The GATA-2 specific primer contained a BamHI site to facilitate subsequent cloning. The PCR product (P1) was cloned into the SmaI/BamHI sites of pBluescript II KS(−).
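As a quick illustration of the primer design described above, the following sketch locates the BamHI recognition sequence (GGATCC) in the GATA-2 specific primer. The primer sequence is copied from the text; the helper names are hypothetical and are not part of any library.

```python
GATA2_PRIMER = "ATGGATCCTCAAGTGTCCGCGCTTAGAA"  # SEQ ID NO:19, as given in the text
BAMHI_SITE = "GGATCC"                           # BamHI recognition sequence

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_sites(seq: str, site: str) -> list[int]:
    """Return 0-based start positions of every occurrence of `site` in `seq`."""
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


bamhi_positions = find_sites(GATA2_PRIMER, BAMHI_SITE)
# BamHI sites are palindromic, so the same site is present on both strands.
site_is_palindromic = reverse_complement(BAMHI_SITE) == BAMHI_SITE
```

The site sits two bases from the 5′ end of the primer, so it is carried at the end of the PCR product where it can be used for cloning.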

Plasmid Constructs

The 7.3 kb DNA fragment containing the putative GATA-2 expression sequences (P1) was ligated to a modified GFP reporter gene (GM2, described above), resulting in construct P1-GM2 (FIG. 3). Based on P1-GM2, constructs containing successive 5′ deletions in the region upstream of the transcription start site were generated using the restriction sites PstI, SacI, AatII, ClaI and ScaI in this upstream region (FIG. 3). Constructs nsP5-GM2 and nsP6-GM2 were generated by ligating the 1116 bp fragment containing the GATA-2 neuron-specific enhancer from −4807 to −3692 to P5-GM2 and P6-GM2, respectively (FIG. 4). The same fragment containing the neuron-specific enhancer was also ligated to a 243 bp SphI/BamHI fragment of the Xenopus elongation factor 1α (EF 1α) minimal promoter that had previously been ligated to the GM2 gene, resulting in construct ns-XS-GM2 (FIG. 4). The EF 1α minimal promoter has been described in Johnson and Krieg, Gene 147:223-6 (1994).

PCR Mapping of Neuron-Specific Enhancer

PCR technology was exploited to create a deletion series within the 1116 bp neuron-specific enhancer using nsP5-GM2 as a template. A total of 10 specific 22-mer primers were synthesized. These included ns4647, ns4493, ns4292, ns4092, ns3990, ns3872, ns3851, ns3831, ns3800 and ns3789, in which the numbers refer to the positions of their 5′ end base in the GATA-2 genomic sequence. A T7 primer was also used in the PCR reactions. The amplified fragments all contained the GM2 gene and SV40 polyadenylation signal in addition to the GATA-2 expression sequences. PCR reactions were performed using Expand™ Long Template PCR System (Boehringer Mannheim) for 25 cycles (94° C., 30 seconds; 55° C., 30 seconds; 72° C., 2 minutes). The PCR products were purified with GENECLEAN II Kit (Bio 101 Inc.) and subsequently used for microinjection.

After a 31 bp neuron-specific enhancer was identified, five additional primers, each containing 2 or 3 mutant bases relative to the wild type enhancer sequence, were designed. These primers are listed below after the wild-type ns3831 primer; the mutant bases are those that differ from ns3831:

(SEQ ID NO: 20) ns3831   5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT-3′
(SEQ ID NO: 21) ns3831M1 5′-TCTGCGAAGCTTTCTGCCCCCTCCTGCCCTCTT-3′
(SEQ ID NO: 22) ns3831M2 5′-TCTGCGCCGCTTTCTGAACCCTCCTGCCCTCTT-3′
(SEQ ID NO: 23) ns3831M3 5′-TCTGCGCCGCTTTCTGCCAACTCCTGCCCTCTT-3′
(SEQ ID NO: 24) ns3831M4 5′-TCTGCGCCGCTTTCTGCCCCAAACTGCCCTCTT-3′
(SEQ ID NO: 25) ns3831M5 5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT-3′

These primers were used in conjunction with the T7 primer for PCR amplification of the target sequence using nsP5-GM2 as the template. PCR conditions were identical to those described above.
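The positions of the mutant bases can be recovered by comparing each mutant primer with the wild-type ns3831 primer. The sketch below does this for the sequences exactly as they are transcribed in this text; the function name `mismatches` is hypothetical. Note that, as transcribed here, the ns3831M5 sequence is identical to ns3831, so the comparison reports no differences for it.

```python
WILD_TYPE = "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT"  # ns3831, SEQ ID NO:20

MUTANTS = {
    "ns3831M1": "TCTGCGAAGCTTTCTGCCCCCTCCTGCCCTCTT",
    "ns3831M2": "TCTGCGCCGCTTTCTGAACCCTCCTGCCCTCTT",
    "ns3831M3": "TCTGCGCCGCTTTCTGCCAACTCCTGCCCTCTT",
    "ns3831M4": "TCTGCGCCGCTTTCTGCCCCAAACTGCCCTCTT",
    "ns3831M5": "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT",  # as transcribed in this text
}


def mismatches(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """Return (1-based position, reference base, variant base) for each difference."""
    if len(reference) != len(variant):
        raise ValueError("sequences must have the same length")
    return [
        (index + 1, ref_base, var_base)
        for index, (ref_base, var_base) in enumerate(zip(reference, variant))
        if ref_base != var_base
    ]


mutant_positions = {name: mismatches(WILD_TYPE, seq) for name, seq in MUTANTS.items()}
```

Positions are counted from the 5′ end of the primer, so position 1 corresponds to −3831 in the GATA-2 numbering used above.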

Microinjection of Zebrafish

Wild-type zebrafish were used for all microinjections. Plasmid DNA was linearized using single-cut restriction sites in the vector backbone, purified using GENECLEAN II Kit (Bio 101 Inc.), and resuspended in 5 mM Tris, 0.5 mM EDTA, 0.1 M KCl at a final concentration of 100 μg/ml. Single cell embryos were microinjected as described above. Each construct was injected independently 2 to 5 times and the data obtained were pooled.

Fluorescent Microscopic Observation

Embryos were anesthetized using tricaine as described above and examined under a FITC filter on a Zeiss microscope equipped with a video camera. Pictures showing GFP positive cells in living embryos were generated by superimposing a bright field image on a fluorescent image using Adobe Photoshop software.

Whole-Mount RNA in Situ Hybridization

Sense and antisense digoxigenin-labeled RNA probes were generated from a GATA-2 cDNA subclone containing a 1 kb fragment of the 5′ coding sequence using DIG/Genius™ 4 RNA Labeling Kit (SP6/T7) (Boehringer Mannheim). RNA in situ hybridizations were performed as described by Westerfield (The Zebrafish Book (University of Oregon Press, 1995)).

Isolation of GATA-2 Genomic DNA

Two GATA-2 positive phage clones, λGATA-21 and λGATA-22, were identified as described above. Preliminary restriction analysis suggested that λGATA-21 contained a large region upstream of the translation start codon. 7412 bp of this clone was sequenced from −4807 to +2605 relative to the translation start site. The putative GATA-2 expression sequences (P1) containing approximately 7.3 kb upstream of the translation start site from λGATA-21 were subcloned into a plasmid vector for expression studies.
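The fragment lengths quoted in this example follow from the coordinates if positions are counted inclusively and there is no position 0 (that is, −1 is immediately followed by +1). That convention is an assumption inferred from the numbers in the text, and `span_length` is a hypothetical helper written only to illustrate it.

```python
def span_length(start: int, end: int) -> int:
    """Number of base pairs from `start` to `end`, inclusive.

    Assumes a coordinate system relative to the translation start in which
    there is no position 0, so -1 is immediately followed by +1.
    """
    if start == 0 or end == 0:
        raise ValueError("position 0 does not exist in this numbering")
    if start > end:
        raise ValueError("start must not be downstream of end")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # the span crosses the -1/+1 boundary; skip the missing 0
    return length


sequenced_region = span_length(-4807, 2605)      # region sequenced from the clone
neuron_enhancer_region = span_length(-4807, -3692)
```

With this convention the sequenced region comes to 7412 bp and the region from −4807 to −3692 to 1116 bp, matching the figures given in the text.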

Expression Pattern of a Modified GFP Gene Driven by the Putative GATA-2 Promoter in Zebrafish Embryos

The construct P1-GM2 was generated by ligation of a modified GFP reporter gene (GM2) to P1 (FIG. 3). This construct was injected into the cytoplasm of single cell zebrafish embryos and GFP expression in the microinjected embryos was examined at a number of distinct developmental stages by fluorescence microscopy.

GFP expression was initially observed by fluorescence microscopy at the 4000 cell stage at about 4 hours post-injection (pi). At the dorsal shield stage (6 hours pi), GFP expression was observed throughout the prospective ventral mesoderm and ectoderm but expression in the dorsal shield was extremely rare. At 16 hours pi, GFP expression was observed in the developing intermediate cell mass (ICM), the early hematopoietic tissue of zebrafish. In addition, GFP expression could be seen in superficial EVL cells at 4 hours pi. Expression in the EVL peaked between 24 and 48 hours pi and became extremely weak by day 7. GFP expression in neurons, including extended axons, was first observed at 30 hours pi and was maintained at high levels through at least day 8.

Embryos injected with the P1-GM2 construct expressed GFP in a manner restricted to hematopoietic cells, EVL cells, and the CNS. The GFP expression patterns in gastrulating embryos, in the blood progenitor cells, and in neurons were consistent with the RNA in situ hybridization patterns previously generated for GATA-2 mRNA expression in zebrafish (Detrich et al., Proc Natl Acad Sci USA 92:10713-7 (1995)). However, GATA-2 expression in EVL has not been detected by RNA in situ hybridizations.

More than 95% of the embryos injected with P1-GM2 had tissue specific GFP expression (Table 3). About 5% of these embryos had non-specific GFP expression, limited to fewer than five cells per embryo. These observations indicated that the DNA fragment extending approximately 7.3 kb upstream from the GATA-2 translation start site sufficed to correctly generate the embryonic tissue-specific pattern of GATA-2 gene expression.

TABLE 3

Construct   No. embryos   No. embryos       No. embryos with     No. embryos with   No. embryos with
            observed      with expression   circulating blood    neuronal           EVL
                                            expression (%)       expression (%)     expression (%)
P1-GM2      141           135               3 (2.13)             106 (75.2)         130 (92.2)
P2-GM2      198           177               32 (15.7)            136 (68.7)         175 (88.4)
P3-GM2      303           291               29 (9.6)             0 (0)              277 (91.4)
P4-GM2      143           126               21 (14.7)            0 (0)              118 (82.5)
P5-GM2      139           90                16 (11.5)            0 (0)              20 (14.4)
P6-GM2      138           44                2 (1.4)              0 (0)              11 (8.0)
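The percentages in Table 3 can be recomputed from the counts, taking the number of embryos observed as the denominator. The sketch below does so for the values as transcribed here; the 0.1 percentage-point tolerance and the function name are arbitrary choices for illustration. With that tolerance, the only entry that does not reproduce within rounding is the circulating blood value for P2-GM2 (32 of 198 is about 16.2%, against a printed 15.7%).

```python
# construct -> (embryos observed, {tissue: (count, printed percent)}), from Table 3
TABLE_3 = {
    "P1-GM2": (141, {"blood": (3, 2.13), "neuronal": (106, 75.2), "EVL": (130, 92.2)}),
    "P2-GM2": (198, {"blood": (32, 15.7), "neuronal": (136, 68.7), "EVL": (175, 88.4)}),
    "P3-GM2": (303, {"blood": (29, 9.6), "neuronal": (0, 0.0), "EVL": (277, 91.4)}),
    "P4-GM2": (143, {"blood": (21, 14.7), "neuronal": (0, 0.0), "EVL": (118, 82.5)}),
    "P5-GM2": (139, {"blood": (16, 11.5), "neuronal": (0, 0.0), "EVL": (20, 14.4)}),
    "P6-GM2": (138, {"blood": (2, 1.4), "neuronal": (0, 0.0), "EVL": (11, 8.0)}),
}


def recompute(table, tolerance=0.1):
    """List entries whose recomputed percentage differs from the printed one.

    `tolerance` is in percentage points and is an illustrative choice.
    """
    differences = []
    for construct, (observed, tissues) in table.items():
        for tissue, (count, printed) in tissues.items():
            percent = 100.0 * count / observed
            if abs(percent - printed) > tolerance:
                differences.append((construct, tissue, round(percent, 1), printed))
    return differences


table_3_differences = recompute(TABLE_3)
```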

Gross Mapping of Tissue-Specific Enhancers

To identify the portions of the GATA-2 expression sequences that are responsible for regulating tissue specific gene expression, several constructs containing deletions in the promoter were generated (FIG. 3). Naturally occurring restriction sites were used to create a series of gross deletions in the expression sequence region. Each construct was individually microinjected into single cell embryos. The developing embryos were observed by fluorescence microscopy at regular intervals for several days.

Embryos injected with P2-GM2, which contains GATA-2 sequences from −4807 to +1, expressed GFP in a manner similar to embryos injected with the original construct, P1-GM2 (Table 3). At 48 hr pi, GFP expression was observed in circulating blood cells, the CNS and the EVL. However, careful observation of the injected embryos at 16 hr pi revealed that expression in the posterior end of the ICM was nearly abolished. This suggested that an enhancer for GATA-2 expression in early hematopoietic progenitor cells may reside in the deleted region. The percentage of embryos expressing GFP in circulating blood cells increased from approximately 2% to 16%, suggesting that a potential repressor for expression of GATA-2 in erythrocytes may also reside in the deleted region.

Embryos injected with P3-GM2, which contains GATA-2 sequences from −3691 to +1, expressed GFP in circulating blood cells and in the EVL, but did not express in the CNS. Embryos injected with other constructs that lack the deleted 1116 bp region, extending from −4807 to −3692, also had no GFP expression in the CNS (Table 3). It was concluded that the 1116 bp region, extending from −4807 to −3692, contained a neuron-specific enhancer element.

Embryos injected with P4-GM2, which contains GATA-2 sequences from −2468 to +1, had a GFP expression pattern similar to those injected with P3-GM2. Injection with P5-GM2, which contains GATA-2 sequences from −1031 to +1, resulted in a sharp drop with respect to percentage of embryos expressing GFP in the EVL, but GFP expression in circulating blood cells was unaffected. This indicates that the 1437 bp region, extending from −2468 to −1032, contains an EVL-specific enhancer. The 1031 bp segment present in P5-GM2 may represent the minimal expression sequences necessary for the maintenance of tissue specific expression of GATA-2.
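The reasoning used in this gross mapping, namely that an enhancer lies in the segment removed between the last construct that still shows expression and the first that does not, can be sketched as follows. The 5′ endpoints and percentages are taken from the text and Table 3; the rule that a "sharp drop" means falling below half of the previous construct's value is an assumption made for illustration, and `map_enhancer` is a hypothetical helper that returns only the first such drop.

```python
# (construct, 5' endpoint, percent of embryos expressing GFP per tissue),
# ordered from the longest to the shortest construct.
CONSTRUCTS = [
    ("P2-GM2", -4807, {"neuronal": 68.7, "EVL": 88.4}),
    ("P3-GM2", -3691, {"neuronal": 0.0, "EVL": 91.4}),
    ("P4-GM2", -2468, {"neuronal": 0.0, "EVL": 82.5}),
    ("P5-GM2", -1031, {"neuronal": 0.0, "EVL": 14.4}),
]


def map_enhancer(constructs, tissue, drop_ratio=0.5):
    """Return (start, end) of the first deleted segment that causes a sharp drop.

    A sharp drop is assumed to mean expression falling below `drop_ratio`
    times the value seen with the next-longer construct.
    """
    for (_, start_long, expr_long), (_, start_short, expr_short) in zip(
        constructs, constructs[1:]
    ):
        before, after = expr_long[tissue], expr_short[tissue]
        if before > 0 and after < drop_ratio * before:
            # Segment present in the longer construct but absent from the shorter.
            return (start_long, start_short - 1)
    return None


neuronal_region = map_enhancer(CONSTRUCTS, "neuronal")
evl_region = map_enhancer(CONSTRUCTS, "EVL")
```

Applied to these data, the sketch returns the same two intervals that the text assigns to the neuron-specific and EVL-specific enhancers.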

Neuron-Specific Enhancer Activity

To confirm the neuron-specific enhancer activity of the 1116 bp region that spans from −4807 to −3692 of GATA-2, nsP5-GM2 was constructed by ligating the 1116 bp fragment to P5-GM2, which contains the 1031 bp region upstream of the translation start of GATA-2 gene operably linked to a sequence encoding GM2 (FIG. 4). Approximately 70% of the embryos injected with nsP5-GM2 had GFP expression in the CNS (FIG. 5), while no embryos injected with P5-GM2 had GFP expression in the CNS as noted in Table 3. This indicates that the 1116 bp region can effectively direct neuron-specific expression.

To determine whether the 1116 bp neuron-specific enhancer activity was context dependent, the construct ns-Xs-GM2 (FIG. 4) was generated by ligating the enhancer to the Xenopus elongation factor 1α minimal promoter (Johnson and Krieg, Gene 147:223-6 (1994)) operably linked to the sequence encoding GM2 (Xs-GM2; FIG. 4). When injected with Xs-GM2, embryos expressed GFP in various tissues including muscle, notochord, blood cells and melanocytes. However, no GFP expression was observed in the CNS (FIG. 5). Injection with ns-XS-GM2 resulted in 8.5% of the embryos having GFP expression in the CNS, far less than obtained by injection with nsP5-GM2 (FIG. 5). Another construct, nsP6-GM2 (FIG. 4), had an additional 653 bp deletion in the GATA-2 minimal expression sequence, extending from −1031 to −378. Injection of nsP6-GM2 resulted in 6.2% of embryos expressing GFP in the CNS (FIG. 5). Injection with P6-GM2 resulted in no GFP expression in the CNS (Table 3). These results suggest that the 1116 bp enhancer has some ability to confer neuronal specificity on a heterologous promoter, but requires proximal elements within its own promoter to exert its full activity.

Fine Mapping of a Neuron-Specific Cis-Acting Regulatory Element

To precisely map the putative neuron-specific enhancer, a series of constructs containing progressive deletions in the 1116 bp DNA fragment was generated by PCR, using nsP5-GM2 as the template. The PCR products obtained were used directly for microinjection. The first deletion series included ns4647, ns4493, ns4292, ns4092 and ns3990 (where the number indicates the upstream endpoint of the deleted fragment). Microinjection of all 5 mutants gave a similar percentage of embryos having GFP expression in the CNS (FIG. 6). This indicated that a neuron-specific enhancer resides within the 298 bp sequence (from −3990 to −3692) contained in ns3990.

Next, two additional deletion constructs, ns3872 and ns3789, were generated. As shown in FIG. 6, over 60% of embryos injected with ns3872 had GFP expression in the CNS, while embryos injected with ns3789 lacked GFP expression in the CNS. This indicated that the neuron-specific enhancer element was located within an 83 bp sequence from −3872 to −3790.

Injection of embryos with three additional deletion constructs ns3851, ns3831 and ns3800 allowed localization of the neuron-specific enhancer element to a 31 bp pyrimidine-rich sequence. This element has the sequence 5′-TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTC-3′ (nucleotides 1 to 31 of SEQ ID NO:20), which extends from −3831 to −3801 within the GATA-2 genomic DNA.
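A short check of the element described above: taking nucleotides 1 to 31 of the ns3831 primer (SEQ ID NO:20) reproduces the quoted sequence, and counting C and T bases gives a sense of what "pyrimidine-rich" means here. The helper name is hypothetical.

```python
NS3831_PRIMER = "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTCTT"  # SEQ ID NO:20
ELEMENT = NS3831_PRIMER[:31]                          # nucleotides 1 to 31


def pyrimidine_fraction(seq: str) -> float:
    """Fraction of bases in `seq` that are pyrimidines (C or T)."""
    return sum(base in "CT" for base in seq) / len(seq)


# Inclusive length implied by the genomic coordinates -3831 to -3801.
length_from_coordinates = (-3801) - (-3831) + 1
element_pyrimidine_fraction = pyrimidine_fraction(ELEMENT)
```

The element contains 26 pyrimidines out of 31 bases (about 84%), with five G and no A residues.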

Site Directed Mutagenesis Within Neuron-Specific Enhancer Element

To determine the core sequence necessary for the activity of the neuron-specific element, five primers, each having two to three altered nucleotides within the 31 bp neuron-specific element (see above), were used to amplify nsP5-GM2. The PCR products obtained were directly injected into single cell embryos. This 31 bp sequence contains an Ets-like recognition site (AGGAC) in an inverted orientation which is present in several neuron-specific promoters (Chang and Thompson, J. Biol Chem 271:6467-75 (1996), Charron et al., J. Biol Chem 270:30604-10 (1995)). Therefore, four of the primers used in these PCR reactions contain altered nucleotides within the Ets-like recognition site or in the adjacent sequence. As expected, embryos injected with ns3831M1, which contains two mutant nucleotides that are thirteen nucleotides upstream of the Ets-like recognition site, showed little change in neuron-specific GFP expression (FIG. 7). A mutation of 2 nucleotides (ns3831M2) that lie three nucleotides upstream of the Ets-like recognition site had no effect on enhancer activity (FIG. 7). Mutation of two nucleotides just one nucleotide upstream of the Ets-like motif, contained in ns3831M3, completely eliminated the neuron-specific enhancer activity of the 31 bp element (FIG. 7). Mutation of three nucleotides (ns3831M4), of which two lie within the Ets-like recognition site, also resulted in a sharp decrease in enhancer activity (FIG. 7). A mutation of two nucleotides that lie within the Ets-like recognition site (ns3831M5) reduced the neuron-specific enhancer activity of the 31 bp element by approximately 50% (FIG. 7). From this it was concluded that a CCCTCCT motif, which partially overlaps the Ets-like recognition site within the 31 bp sequence, is absolutely required for neuron-specific enhancer activity.
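The positional relationships described above can be laid out explicitly. This sketch rests on one interpretive assumption: "inverted orientation" is read as meaning that AGGAC lies on the complementary strand, so the strand shown in the text carries its base-by-base complement, TCCTG. The mutated positions are those obtained by comparing the M1 to M4 primers with ns3831 as printed; M5 is omitted because its mutant bases cannot be recovered from the sequence as transcribed here. Helper names are hypothetical.

```python
ELEMENT = "TCTGCGCCGCTTTCTGCCCCCTCCTGCCCTC"  # 31 bp element, 5' to 3'
ETS_LIKE_SITE = "AGGAC"
REQUIRED_MOTIF = "CCCTCCT"
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def span(seq: str, sub: str):
    """Return the 1-based (start, end) of the first occurrence of `sub`, or None."""
    start = seq.find(sub)
    return None if start < 0 else (start + 1, start + len(sub))


def hits(positions, region):
    """Return the positions that fall inside the inclusive `region`."""
    return [p for p in positions if region[0] <= p <= region[1]]


# Assumption: the Ets-like site is on the opposite strand, so the strand shown
# carries its complement (TCCTG) at the same positions.
ets_span = span(ELEMENT, ETS_LIKE_SITE.translate(_COMPLEMENT))
motif_span = span(ELEMENT, REQUIRED_MOTIF)
shared = sorted(
    set(range(ets_span[0], ets_span[1] + 1))
    & set(range(motif_span[0], motif_span[1] + 1))
)

# 1-based mutated positions, from comparing each mutant primer with ns3831.
MUTATED = {
    "ns3831M1": [7, 8],
    "ns3831M2": [17, 18],
    "ns3831M3": [19, 20],
    "ns3831M4": [21, 22, 23],
}
motif_hits = {name: hits(pos, motif_span) for name, pos in MUTATED.items()}
ets_hits = {name: hits(pos, ets_span) for name, pos in MUTATED.items()}
```

Under this reading the Ets-like site occupies positions 22 to 26 and the CCCTCCT motif positions 19 to 25. The M1 and M2 changes fall outside both, the M3 changes fall inside the motif but outside the Ets-like site, and two of the three M4 changes fall inside the Ets-like site, which agrees with the spacing described in the text.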

This dissection of expression sequences using transgenic fish, exemplified in zebrafish and with GATA-2 as described above, provides a system that allows the rapid and efficient identification of those cis-acting elements that play key roles in modulating the expression of developmentally regulated genes. Identification of these cis-acting elements is a useful step toward determining the genes that operate earlier than the gene under study in the specification of a developmental pathway (since the identified distal regulatory elements interact with transcription factors which must be expressed for the regulatory elements to function).

Careful analysis of GATA-2 promoter activity in zebrafish embryos revealed three distinct tissue specific enhancer elements. These three elements appear to act independently to enhance gene expression specifically in blood precursors, the EVL, or the CNS. Deletion of one or two of the elements will generate transgene constructs that can drive expression of a gene of interest in a specific tissue. Such constructs also allow study of the tissue-specific function of genes expressed in multiple tissues.

It has been shown that the developmental regulation of the mammalian HOX6 and GAP-43 promoter activities is conserved in zebrafish (Westerfield et al., Genes Dev 6:591-8 (1992), Reinhard et al., Development 120:1767-75 (1994)). If the same neuron-specific element identified in the zebrafish GATA-2 promoter is also shown to be required for neuron-specific activity of the mouse promoter, one could specifically knock out expression of GATA-2 in the mouse CNS by targeting this cis-element. This would allow one to determine precisely the role that GATA-2 plays in the CNS.

The neuron-specific enhancer element of GATA-2 has been precisely mapped and found to contain the core DNA consensus sequence for binding by Ets-related transcription factors. Although Ets-related factors have been implicated in the regulation of expression of a number of neuron-specific genes (Chang and Thompson, J. Biol Chem 271:6467-75 (1996), Charron et al., J. Biol Chem 270:30604-10 (1995)), another sequence, CCTCCT, present in this region of the zebrafish GATA-2 promoter was found to be required for expression in the CNS. This motif partially overlaps an inverted form of the core sequence of the Ets DNA binding recognition site. As has been shown for other genes, the activities of Ets family proteins often rely more on their ability to interact with other transcription factors than on specific binding to a cognate DNA sequence (Crepieux et al., Crit Rev Oncog 5:615-38 (1994)). It is possible that an independent factor that binds to the CCTCCT motif is required for neuron-specific activity of the GATA-2 promoter.

A number of growth factors are known to affect early embryonic expression of GATA-2. Noggin and activin, which both have dorsalizing activity in Xenopus embryos, downregulate GATA-2 expression in dorsal mesoderm (Walmsley et al., Development 120:2519-29 (1994)). BMP-4 activates GATA-2 expression in ventral mesoderm and is probably important to early blood progenitor proliferation (Maeno et al., Blood 88:1965-72 (1996)). Growth factors that might affect expression of GATA-2 in neurons are not known. However, both BMP-2 and BMP-6 can activate neuron-specific gene expression (Fann and Patterson, J. Neurochem 63:2074-9 (1994)). Consistent with studies on growth factors that upregulate or downregulate GATA-2 expression, GATA-2 promoter activity was excluded from the zebrafish dorsal shield. It has also been discovered that lithium chloride treatment dorsalizes the injected embryos and dramatically reduces GATA-2 promoter activity as determined by GFP expression.

Although GATA-2 expression has not been observed in the EVL by in situ hybridization on whole embryos, this may be due to the conditions used. In mouse, embryonic mast cells present in the skin have only been detected by in situ hybridization performed on skin tissue sections (Jippo et al., Blood 87:993-8 (1996)). Interestingly, expression of GATA-2 in mouse skin mast cells occurs only during a short period of embryogenesis, similar to what has been found for EVL cells in zebrafish. It is possible that the constructs used in this example may be missing elements that would specifically silence GATA-2 expression in the zebrafish EVL.

The method described above is generally applicable to the dissection of any developmentally regulated vertebrate promoter. Tissue specific and growth factor response elements can be rapidly identified in this manner. The fact that zebrafish typically produce hundreds of fertilized eggs per mating facilitates obtaining statistically significant results. While tissue culture systems have been useful for identifying many important transcription factors, transfection analysis in tissue culture cells cannot simulate the complex, rapidly changing microenvironment to which the promoter must respond during embryogenesis. Temporal and spatial analysis of promoter activity can be only poorly mimicked in vitro. The system described herein allows complete analysis of promoter activity in all tissues of a whole vertebrate.

1. A transgenic fish that expresses an expression product, comprising ahomologous cell lineage-specific expression sequence operably linked toan exogenous nucleic acid sequence encoding the expression product,wherein the cell lineage expression sequence and the exogenous sequenceencoding the expression product are integrated into the genome of thefish, wherein the expression product encoded by the transgenic sequenceexhibits cell lineage-specific expression, and wherein the fish isselected from the group consisting of salmon, trout, tuna, halibut,catfish, zebrafish, medaka, carp, tilapia, goldfish and loach.
 2. The transgenic fish of claim 1 wherein the expression sequence and the sequence encoding the expression product are not operably linked in nature.
 3. The transgenic fish of claim 1 wherein the expression product is heterologous.
 4. The transgenic fish of claim 3 wherein the expression product is a reporter protein.
 5. The transgenic fish of claim 4 wherein the reporter protein is selected from the group consisting of β-galactosidase, chloramphenicol acetyltransferase, and green fluorescent protein.
 6. The transgenic fish of claim 5 wherein the reporter protein is green fluorescent protein.
 7. (canceled)
 8. The transgenic fish of claim 1 wherein the fish is zebrafish.
 9. The transgenic fish of claim 1 wherein the expression product is expressed only in cells selected from the group consisting of blood cells, nerve cells, and skin cells.
 10-12. (canceled)
 13. The transgenic fish of claim 1 wherein the expression sequence is selected from the group consisting of GATA-1 expression sequence and a GATA-2 expression sequence.
 14-18. (canceled)
 19. The transgenic fish of claim 1 wherein the transgenic fish developed from, or is the progeny of a transgenic fish developed from, an embryonic cell into which the sequence encoding the expression product was introduced.
 20. The transgenic fish of claim 1 wherein the expression product is expressed only in predetermined cell lineages.
 21. The transgenic fish of claim 1 wherein the sequence encoding the expression product is genetically linked to an identified mutant gene.
 22. The transgenic fish of claim 1 wherein the expression sequence comprises a homologous promoter operably linked to a homologous enhancer.
 23. (canceled)
 24. The transgenic fish of claim 1 further comprising (a) intron sequences operably linked to the sequence encoding the expression product, (b) a polyadenylation signal operably linked to the sequence encoding the expression product, or both.
 25. Cells isolated from the transgenic fish of claim 1 wherein the cells express the expression product.
 26. A method of making a transgenic fish, the method comprising (a) introducing an exogenous construct into an egg cell or embryonic cell of a fish, wherein the construct comprises a homologous cell lineage-specific expression sequence operably linked to a sequence encoding an expression product, and (b) allowing the egg cell or embryonic cell to develop into a fish, (c) crossing the fish of (b) with a second fish to produce a third fish having the exogenous construct, wherein the expression product exhibits cell lineage-specific expression in the third fish and wherein the sequence encoding the expression product is integrated into the genome of the third fish, and wherein the fish is selected from the group consisting of salmon, trout, tuna, halibut, catfish, zebrafish, medaka, carp, tilapia, goldfish and loach.
 27-29. (canceled)
 30. A method of identifying a test compound that affects expression or the pattern of expression of a fish gene comprising (a) exposing the fish or transgenic progeny of the fish made by the method of claim 26 to a test compound, (b) detecting the expression product in the fish exposed to the test compound, and (c) comparing the expression or pattern of expression of the expression product in the fish exposed to the test compound with the expression or pattern of expression of the expression product in the fish or progeny of the fish not exposed to the test compound, wherein if the expression or pattern of expression of the expression product in the fish exposed to the test compound differs from expression or the pattern of expression in the fish not exposed to the test compound, then the test compound affects expression or the pattern of expression of the fish gene.
 31. A method of identifying the pattern of expression of a fish gene comprising detecting the expression product in the fish or transgenic progeny of the fish made by the method of claim 26, wherein the pattern of expression of the expression product in the fish or progeny of the fish identifies the pattern of expression of the fish gene.
 32. A method of identifying a mutant gene that affects expression or the pattern of expression of a fish gene comprising (a) crossing the fish or transgenic progeny of the fish made by the method of claim 26 to a second fish having an identified mutant gene to produce a third fish having both the exogenous construct and the identified mutation, (b) detecting the expression product in the third fish or progeny of the third fish, and (c) comparing the expression or pattern of expression of the expression product in the third fish or the progeny of the third fish with the expression or pattern of expression of the expression product in the second fish, wherein if the expression or pattern of expression of the expression product in the third fish or progeny of the third fish differs from the expression or pattern of expression in the second fish, then the mutant gene affects expression or the pattern of expression of the fish gene.
 33. A method of marking a mutant gene comprising (a) crossing the fish or transgenic progeny of the fish made by the method of claim 26 to a second fish having an identified mutant gene, wherein the exogenous construct and the mutant gene map to the same region of the genome, to produce a third fish having both the exogenous construct and the mutant gene, and (b) crossing the third fish to a fourth fish, wherein the fourth fish has neither the exogenous construct nor the mutant gene, to produce a fifth fish, wherein the fifth fish has both the exogenous construct and the mutant gene, wherein the mutant gene is marked by the exogenous construct in the fifth fish.
 34-40. (canceled)
 41. The transgenic fish of claim 1 wherein expression of the expression product is stable and transmitted through the germline.
 42. The transgenic fish of claim 1, wherein the homologous expression sequence and the sequence encoding the expression product are contained in an exogenous construct.
 43. A method of identifying genes, the method comprising: (a) isolating from the fish of claim 1 or progeny thereof a cell expressing the cell lineage-specific expression product, and (b) identifying genes that are expressed in the cell.
 44. The method of claim 43, wherein the gene is identified by extracting RNA and performing differential display.
 45. The method of claim 43, wherein the gene is identified by extracting RNA and performing subtractive hybridization.
 46. A method for identifying genes involved in cell lineage specific expression comprising: (a) Introducing a mutation into the fish of claim 1 and (b) Detecting a change in the expression or pattern of expression of the expression product in the fish containing the mutation, whereby a change in the expression or pattern of expression product identifies a gene involved in cell lineage-specific expression.
 47. A method of identifying compounds that affect expression of genes comprising: (a) Contacting a first fish of claim 1 with a compound, wherein the expression product is a directly detectable reporter protein; (b) Comparing the expression or pattern of expression of the expression product in the fish contacted with the compound with a second fish of claim 1 that was not contacted with the compound; (c) Determining the effect of the compound on the expression or pattern of expression of expression product wherein an effect on the expression or pattern of expression of the expression product indicates that the compound affects expression of genes.
 48. The transgenic fish of claim 1, wherein the fish is a transgenic progeny of the fish into which the exogenous sequence encoding the expression product was introduced.
 49. The transgenic fish of claim 48 wherein the fish is zebrafish.